Introduction: Automating Form Filling with AI
Filling out online forms manually is time-consuming. Traditional automation tools like Selenium require predefined locators, which break if the form structure changes. But what if AI could detect and fill out forms dynamically—just like a human?
In this guide, we’ll explore how AI, combined with TensorFlow and Selenium, can intelligently recognize form elements and automate form submission.
We’ll cover:
✔ Training an AI model to detect form fields
✔ Capturing and labeling form elements dynamically
✔ Testing AI predictions before deployment
✔ Submitting a real form using AI & Selenium
By the end, you’ll see how AI-powered form automation adapts to any form layout without hardcoded locators! 🚀
Step 1: The HTML Form Used for AI Training
Before training our AI model, we needed a structured form that contained different field types, such as:
- Text Fields (Name, Email)
- Dropdowns (Course Selection)
- Checkboxes (Skills, Agreement)
- Radio Buttons (Experience Level)
- Submit Button
This form served as the foundation for training our AI model.
Sample Training Form:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Test Form for AI-Based Automation</title>
</head>
<body>
<h2>Test Form for AI-Based Automation</h2>
<form id="testForm">
<label for="name">Name:</label>
<input type="text" id="name" name="name" placeholder="Enter your name"><br><br>
<label for="email">Email:</label>
<input type="email" id="email" name="email" placeholder="Enter your email"><br><br>
<label for="course">Select Course:</label>
<select id="course" name="course">
<option value="selenium">Selenium Automation</option>
<option value="python">Python for Testing</option>
<option value="java">Java for Testers</option>
</select><br><br>
<!-- Skills and experience fields (ids skill1, intermediate) referenced by take_field_screenshot.py below -->
<label>Skills:</label>
<input type="checkbox" id="skill1" name="skills" value="selenium">
<label for="skill1">Selenium</label>
<input type="checkbox" id="skill2" name="skills" value="python">
<label for="skill2">Python</label><br><br>
<label>Experience Level:</label>
<input type="radio" id="beginner" name="experience" value="beginner">
<label for="beginner">Beginner</label>
<input type="radio" id="intermediate" name="experience" value="intermediate">
<label for="intermediate">Intermediate</label><br><br>
<input type="checkbox" id="agree" name="agree">
<label for="agree">I agree to the terms and conditions</label><br><br>
<button type="submit" id="submitBtn">Submit</button>
</form>
</body>
</html>
This form provided AI with different field types to learn from.
Step 2: Capturing Screenshots of Form Fields for AI Training
To train AI, we needed multiple labeled images of each form field.
How we did it:
- Used Selenium WebDriver to open the form.
- Captured 50+ screenshots of each form element (text fields, checkboxes, dropdowns, etc.).
- Stored images in labeled folders for AI training.
File: take_field_screenshot.py
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import os

# Define dataset directory
dataset_dir = "dataset"

# Define categories (subfolders)
categories = ["text_field", "email_field", "dropdown", "checkbox", "radio_button", "button"]

# Ensure each category folder exists
for category in categories:
    os.makedirs(os.path.join(dataset_dir, category), exist_ok=True)

# Start Selenium WebDriver
driver = webdriver.Chrome()
driver.get("http://localhost/form_test.html")  # Update with your local form URL

# Wait for page to load
time.sleep(3)

# Define form elements with locators
elements = {
    "text_field": driver.find_element(By.ID, "name"),
    "email_field": driver.find_element(By.ID, "email"),
    "dropdown": driver.find_element(By.ID, "course"),
    "checkbox": driver.find_element(By.ID, "skill1"),
    "radio_button": driver.find_element(By.ID, "intermediate"),
    "button": driver.find_element(By.ID, "submitBtn")
}

# Capture multiple screenshots for each element with sequential numbering
for i in range(50):  # Capture 50 images per element
    for element_name, element in elements.items():
        try:
            element_path = os.path.join(dataset_dir, element_name, f"{element_name}_{i}.png")
            element.screenshot(element_path)
            print(f"📸 Saved: {element_path}")
            time.sleep(0.5)  # Small delay between captures
        except Exception as e:
            print(f"❌ Failed to capture {element_name}: {e}")

# Close browser
driver.quit()
print("✅ Images captured successfully!")
Now AI has labeled training images for recognizing form fields.
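Before moving on, it is worth a quick sanity check that every labeled folder actually contains screenshots. Here is a minimal sketch, assuming the dataset/ layout created by the script above:

import os

dataset_dir = "dataset"
categories = ["text_field", "email_field", "dropdown", "checkbox", "radio_button", "button"]

# Count the captured screenshots in each labeled folder
for category in categories:
    path = os.path.join(dataset_dir, category)
    images = [f for f in os.listdir(path) if f.lower().endswith(".png")] if os.path.exists(path) else []
    print(f"{category}: {len(images)} images")

If any category reports zero images, rerun the capture script before training; a missing class will simply be skipped during training and the model will never learn it.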
Step 3: Training the AI Model to Recognize Form Fields
Using TensorFlow, we trained a Convolutional Neural Network (CNN) to classify form fields.
What AI learns in this step:
✔ Detecting different form elements from images
✔ Recognizing patterns in text fields, checkboxes, buttons, etc.
✔ Distinguishing between multiple form elements
File: train_model.py
import tensorflow as tf
import numpy as np
import cv2
import os
import random

# Define categories for form elements
categories = ["text_field", "email_field", "dropdown", "checkbox", "radio_button", "button"]
IMG_SIZE = 64  # Image size for model training
data_dir = "dataset/"

# Prepare dataset
training_data = []
labels = []

for category in categories:
    path = os.path.join(data_dir, category)
    if not os.path.exists(path):
        print(f"Skipping missing folder: {path}")
        continue
    class_num = categories.index(category)
    images = os.listdir(path)
    random.shuffle(images)  # Shuffle dataset for better learning

    for img_name in images:
        img_path = os.path.join(path, img_name)

        # Ensure file is an image
        if not img_name.lower().endswith(('.png', '.jpg', '.jpeg')):
            print(f"Skipping non-image file: {img_path}")
            continue

        img_array = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)

        # Skip empty or corrupted images
        if img_array is None or img_array.size == 0:
            print(f"Skipping corrupted image: {img_path}")
            continue

        resized_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))

        # Data augmentation
        flipped = cv2.flip(resized_array, 1)  # Horizontal flip
        blurred = cv2.GaussianBlur(resized_array, (3, 3), 0)  # Blur
        rotated = cv2.rotate(resized_array, cv2.ROTATE_90_CLOCKWISE)  # Rotate 90 degrees

        training_data.append(resized_array)
        labels.append(class_num)
        training_data.append(flipped)
        labels.append(class_num)
        training_data.append(blurred)
        labels.append(class_num)
        training_data.append(rotated)
        labels.append(class_num)

print(f"Successfully loaded {len(training_data)} valid images for training.")

# Convert to NumPy arrays
if len(training_data) == 0:
    raise ValueError("No valid training data found! Please check dataset.")

X = np.array(training_data).reshape(-1, IMG_SIZE, IMG_SIZE, 1) / 255.0
y = np.array(labels)

# Define AI model (a small CNN classifier)
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(len(categories), activation='softmax')
])

# Compile and train model
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Train with a validation split to monitor overfitting
if len(X) > 0:
    model.fit(X, y, epochs=25, batch_size=16, validation_split=0.2)
    model.save("form_detector.h5")
    print("AI Model Trained & Saved!")
else:
    print("No valid images found. Please capture new images and retry.")
Process:
- Loaded the dataset (images captured in Step 2).
- Preprocessed the images (grayscale conversion, resizing, and data augmentation).
- Trained the AI model to classify elements into six categories: text fields, email fields, dropdowns, checkboxes, radio buttons, and buttons.
- Saved the trained model as form_detector.h5.
Now AI can recognize form fields dynamically based on their appearance!
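One optional way to see whether the validation split is catching overfitting is to plot the History object that model.fit returns. Here is a minimal sketch, assuming you capture the return value in train_model.py as history = model.fit(...) and have matplotlib installed:

import matplotlib.pyplot as plt

# history = model.fit(X, y, epochs=25, batch_size=16, validation_split=0.2)
plt.plot(history.history["accuracy"], label="training accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.title("Training vs. validation accuracy per epoch")
plt.show()

If validation accuracy plateaus or drops while training accuracy keeps climbing, the model is starting to memorize the screenshots rather than learn general features.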
Step 4: Testing the AI Model on Form Fields
Before using AI for form filling, we tested whether it could correctly detect and classify form fields.
File: predict_element.py
import tensorflow as tf
import numpy as np
import cv2
import os

# Load trained AI model
model = tf.keras.models.load_model("form_detector.h5")

# Define categories
element_types = ["text_field", "email_field", "dropdown", "checkbox", "radio_button", "button"]

# Load images from dataset to test
test_images = [
    "dataset/text_field/text_field_1.png",
    "dataset/email_field/email_field_2.png",
    "dataset/dropdown/dropdown_3.png",
    "dataset/checkbox/checkbox_4.png",
    "dataset/radio_button/radio_button_5.png",
    "dataset/button/button_6.png"
]

# Predict for each test image
for img_path in test_images:
    if not os.path.exists(img_path):
        print(f"⚠️ Skipping missing file: {img_path}")
        continue
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    img_resized = cv2.resize(img, (64, 64))
    img_array = np.array(img_resized).reshape(-1, 64, 64, 1) / 255.0
    prediction = model.predict(img_array)
    predicted_class = np.argmax(prediction)
    print(f"🔍 Image: {img_path} → Predicted: {element_types[predicted_class]}")
How we verified AI accuracy:
- Loaded the trained AI model (form_detector.h5).
- Ran predictions on sample images from the labeled folders and compared each prediction against the folder name.
- For a stricter test, swap in screenshots that were not part of the training dataset.
This step confirmed that the model classifies each field type correctly before we trust it on a live form; testing on unseen screenshots is what shows it generalizes rather than memorizing specific layouts.
AI is now ready for real-world form automation!
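For a more systematic check than spot-testing a handful of images, you can hold out part of the dataset before training and score the model on it. Here is a minimal sketch, assuming the X, y arrays and model defined in train_model.py and scikit-learn installed (the split shown here is illustrative):

from sklearn.model_selection import train_test_split

# Hold out 20% of the labeled images before training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train on the remaining 80% with the same settings as train_model.py
model.fit(X_train, y_train, epochs=25, batch_size=16, validation_split=0.2)

# Score on images the model never saw during training
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2%}")

Keep in mind that augmented copies of the same screenshot can land on both sides of the split, so this estimate is optimistic; splitting before augmentation gives a stricter number.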
Step 5: Submitting a Real Form Using AI & Selenium
In this final step, AI dynamically detects form fields, fills them, and submits the form using Selenium.
File: form_submission.py
import cv2
import numpy as np
import os
import time
import tensorflow as tf
from selenium import webdriver
from selenium.webdriver.common.by import By

# STEP 1: Start Selenium WebDriver & capture a screenshot of the form
driver = webdriver.Chrome()
driver.get("https://training.qaonlinetraining.com/testPage.php")
time.sleep(3)

screenshot_path = "dataset/form_screenshot.png"
driver.save_screenshot(screenshot_path)
print(f"📸 Screenshot saved: {screenshot_path}")

# STEP 2: Load the screenshot and locate candidate form elements
img = cv2.imread(screenshot_path, cv2.IMREAD_GRAYSCALE)

# Apply image preprocessing (adaptive threshold highlights element outlines)
thresh = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)

# Find contours (candidate form elements)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Create directory for cropped elements
cropped_dir = "dataset/cropped"
os.makedirs(cropped_dir, exist_ok=True)

# Crop form elements dynamically
cropped_elements = []
for i, contour in enumerate(contours):
    x, y, w, h = cv2.boundingRect(contour)
    if 10 < w < 400 and 10 < h < 300:  # Filter out regions too small or too large to be form fields
        cropped = img[y:y+h, x:x+w]
        cropped_path = os.path.join(cropped_dir, f"element_{i}.png")
        cv2.imwrite(cropped_path, cropped)
        cropped_elements.append(cropped_path)
        print(f"✅ Cropped Element Saved: {cropped_path}")

driver.quit()

# STEP 3: Load the trained AI model and classify each cropped element
model = tf.keras.models.load_model("form_detector.h5")
element_types = ["text_field", "email_field", "dropdown", "checkbox", "radio_button", "button"]

detected_elements = []
for cropped_path in cropped_elements:
    img = cv2.imread(cropped_path, cv2.IMREAD_GRAYSCALE)

    # Resize image to the model's input size
    img_resized = cv2.resize(img, (64, 64))
    img_array = np.array(img_resized).reshape(-1, 64, 64, 1) / 255.0

    # AI prediction
    prediction = model.predict(img_array)
    predicted_class = np.argmax(prediction)
    detected_element = element_types[predicted_class]
    detected_elements.append(detected_element)
    print(f"🔍 AI detected element: {detected_element}")

# STEP 4: Reopen the form and fill it based on AI predictions
driver = webdriver.Chrome()
driver.get("https://training.qaonlinetraining.com/testPage.php")
time.sleep(3)

# Find all input elements dynamically
all_inputs = driver.find_elements(By.TAG_NAME, "input")

# Track handled element types to prevent duplicates
used_elements = set()

# Match AI predictions with real HTML elements
for detected_element in detected_elements:
    if detected_element in used_elements:
        continue  # Skip duplicate predictions

    for input_element in all_inputs:
        input_type = input_element.get_attribute("type")

        # Text fields
        if detected_element == "text_field" and input_type == "text":
            input_element.send_keys("John Doe")
        # Email fields
        elif detected_element == "email_field" and input_type == "email":
            input_element.send_keys("johndoe@example.com")
        # Checkboxes
        elif detected_element == "checkbox" and input_type == "checkbox":
            if not input_element.is_selected():
                driver.execute_script("arguments[0].click();", input_element)
        # Radio buttons
        elif detected_element == "radio_button" and input_type == "radio":
            driver.execute_script("arguments[0].click();", input_element)

    # Mark this element type as handled
    used_elements.add(detected_element)

# STEP 5: Submit the form using the correct input type
submit_buttons = driver.find_elements(By.TAG_NAME, "input")  # Get all inputs
submit_clicked = False  # Track submit button click

for button in submit_buttons:
    if button.get_attribute("type") == "submit" and not submit_clicked:
        button.click()
        print("✅ Form Submitted!")
        submit_clicked = True  # Prevent duplicate submissions

# STEP 6: Close browser
time.sleep(3)
driver.quit()
print("✅ AI-based form submission completed for testPage.php!")
How the form is filled dynamically:
- Captured a fresh screenshot of the form.
- AI processed the image to detect elements dynamically.
- Selenium matched AI predictions with actual form fields.
- Filled in the detected fields (text fields, email fields, checkboxes, and radio buttons); see the sketch below for extending this to dropdowns.
- Clicked the submit button to complete automation.
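Note that the matching loop above only walks <input> elements, so a predicted dropdown is not actually filled. If you want to cover <select> fields as well, Selenium's Select helper can handle it; here is a rough sketch, reusing the detected_elements list while the driver is still open:

from selenium.webdriver.support.ui import Select

# Handle dropdowns predicted by the AI model
if "dropdown" in detected_elements:
    for select_element in driver.find_elements(By.TAG_NAME, "select"):
        Select(select_element).select_by_index(1)  # Choose an option by position; adjust as needed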
Why this approach is powerful:
- No hardcoded locators required
- AI adapts to new forms without code changes
- Can be applied to multiple websites without modification
Now AI fills out forms just like a human!
Key Takeaways
✔ AI-powered automation adapts to any form layout—no hardcoded locators needed.
✔ Form fields are detected dynamically, making scripts future-proof.
✔ This method can be applied to login forms, job applications, survey forms, and more.
Full Code Repository
🔗 GitHub Repository: AI-Powered Form Automation
🚀 Want to automate forms? Train your own AI model today!