
Every year, museums and collectors spend millions of dollars authenticating paintings to distinguish real masterpieces from sophisticated forgeries. Despite advancements in carbon dating and expert curation, identifying fake paintings remains a significant challenge. This project explores how computer vision can assist curators in analyzing brush strokes to detect potential fraud before a human expert performs the final verification.
The guiding question: using OpenCV-based image processing techniques, how can one build a painting authentication pipeline that performs preliminary sweeps, much like an applicant tracking system filters resumes before they reach a hiring manager?
Key Research Questions
1️⃣ Can we infer what type of paintbrush was used based on a specific paint stroke?
2️⃣ What is the best way to differentiate two nearly identical brush strokes?
3️⃣ How can image processing techniques improve painting authentication?
To answer these questions, we first gathered a dataset of paintings and individual brush strokes, analyzed their visual properties, and then compared brushwork patterns using computer vision techniques.
Dataset Creation & Preprocessing
As with any machine learning project, the dataset and processing pipeline had to be built first:
✅ Collect a dataset of historical and modern paintings
✅ Segment individual brush strokes for comparative analysis
✅ Convert images into a standardized format for processing
This work used a custom-built dataset of different paintbrush strokes and referenced prior published papers.
Example of Image Preprocessing
- Thresholding: Converts an image into a binary representation (black & white).
- Morphological Operations: Enhances structure by applying erosion (removing noise) and dilation (expanding stroke details).
- Skeletonization: Reduces a brush stroke to its essential structure for direct comparison.
Key Computer Vision Techniques for Stroke Comparison
1️⃣ Skeletonization: Extracting the Core Structure
To analyze brush strokes effectively, we performed skeletonization, a process that reduces a paint stroke to a thin, roughly one-pixel-wide centerline while preserving its overall shape.
🔹 Skeletonization Process in OpenCV:
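import cv2
import numpy as np

# Read the image as grayscale
image = cv2.imread('paint_stroke.jpg', cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError('paint_stroke.jpg could not be read')
# Apply thresholding (assumes a light stroke on a dark background;
# use cv2.THRESH_BINARY_INV for a dark stroke on a light background)
_, binary = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
# Morphological skeletonization: a single opening does not thin a stroke,
# so erode repeatedly and keep the pixels that each opening removes
kernel = np.ones((3,3), np.uint8)
skeleton = np.zeros_like(binary)
working = binary.copy()
while cv2.countNonZero(working) > 0:
    eroded = cv2.erode(working, kernel, borderValue=0)
    opened = cv2.dilate(eroded, kernel)
    skeleton = cv2.bitwise_or(skeleton, cv2.subtract(working, opened))
    working = eroded
# Save and display the skeletonized image
cv2.imwrite('skeletonized_stroke.jpg', skeleton)
cv2.imshow('Skeletonized Stroke', skeleton)
cv2.waitKey(0)
cv2.destroyAllWindows()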


2️⃣ Morphological Operations: Enhancing Brush Stroke Features
To refine the images, dilation and erosion were applied:
| Operation | Effect |
|---|---|
| Dilation | Expands the white regions of a stroke, enhancing structure. |
| Erosion | Shrinks white regions, removing unwanted noise. |
🔹 Example of Dilation & Erosion in OpenCV:
# Apply dilation
dilated = cv2.dilate(binary, kernel, iterations=1)
# Apply erosion
eroded = cv2.erode(binary, kernel, iterations=1)
cv2.imwrite('dilated_stroke.jpg', dilated)
cv2.imwrite('eroded_stroke.jpg', eroded)


3️⃣ Color & Texture Matching for Authentication
While skeletonization focuses on stroke shape, additional comparisons were performed using:
✅ Color Histograms – Ensuring color consistency across genuine and forged paintings.
✅ Texture Matching – Using LBP (Local Binary Patterns) to capture micro-textural details in brushwork.
🔹 Example of Color Histogram Analysis in OpenCV:
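# Load the two strokes to compare in color (file names are placeholders)
image1 = cv2.imread('stroke_a.jpg')
image2 = cv2.imread('stroke_b.jpg')
# Compute the histogram of the first (blue) channel for each image
hist1 = cv2.calcHist([image1], [0], None, [256], [0, 256])
hist2 = cv2.calcHist([image2], [0], None, [256], [0, 256])
# Normalize each histogram so it sums to 1
hist1 /= hist1.sum()
hist2 /= hist2.sum()
# Compare histograms (Chi-Square method: 0 means identical, larger means less similar)
comparison_score = cv2.compareHist(hist1, hist2, cv2.HISTCMP_CHISQR)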
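Texture matching with LBP is mentioned above without code. The following is a minimal NumPy sketch of the basic 8-neighbor LBP (radius 1, no interpolation, no rotation invariance), with histograms compared by intersection. The function names are hypothetical, and the project may have used a different LBP variant or a library implementation.

```python
import numpy as np

def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic 8-neighbor LBP codes."""
    gray = np.asarray(gray, dtype=np.int32)
    if gray.ndim != 2 or min(gray.shape) < 3:
        raise ValueError("expected a 2-D grayscale image of at least 3x3 pixels")
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    codes = np.zeros_like(center)
    # Neighbor offsets (row, col), clockwise from the top-left neighbor
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbor is at least as bright as the center
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]; 1 means the normalized histograms are identical."""
    return float(np.minimum(hist_a, hist_b).sum())

# Synthetic textures: a flat patch and a random-noise patch
flat = np.full((10, 10), 128, dtype=np.uint8)
rng = np.random.default_rng(0)
noise = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)

flat_hist = lbp_histogram(flat)
noise_hist = lbp_histogram(noise)
print(histogram_intersection(noise_hist, noise_hist))
print(histogram_intersection(flat_hist, noise_hist))
```

Intersection is used here because it is already bounded between 0 and 1, which makes it easy to combine with other similarity measures.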
💡 Why Color & Texture Matching?
- Helps differentiate paintings created under different lighting conditions.
- Identifies subtle variations in paint layering and brush pressure.
Generating a Similarity Score for Brush Stroke Matching
The final step was creating a numerical similarity score by combining all processed data:
1️⃣ Skeletonized stroke comparison → Measures shape similarity.
2️⃣ Texture similarity using LBP → Quantifies fine-grained details.
3️⃣ Color histogram matching → Ensures consistency in pigmentation.
Each comparison method was weighted, and a final similarity score was assigned between 0% (no match) and 100% (identical strokes).
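The weighting scheme is not given in this write-up, so the sketch below only illustrates the general idea: map each measure to a similarity in [0, 1], then take a weighted average and scale it to a percentage. The weights (0.5, 0.3, 0.2), the `scale` parameter, and both function names are assumptions, not the values used in the project.

```python
def distance_to_similarity(distance, scale=1.0):
    """Map a non-negative distance to (0, 1]; 0 distance gives similarity 1."""
    if distance < 0 or scale <= 0:
        raise ValueError("distance must be >= 0 and scale must be > 0")
    return 1.0 / (1.0 + distance / scale)

def combined_similarity(shape_sim, texture_sim, color_sim, weights=(0.5, 0.3, 0.2)):
    """Weighted average of three similarities in [0, 1], returned as a percentage."""
    sims = (shape_sim, texture_sim, color_sim)
    if any(not 0.0 <= s <= 1.0 for s in sims):
        raise ValueError("each similarity must lie in [0, 1]")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    weighted = sum(w * s for w, s in zip(weights, sims))
    return 100.0 * weighted / sum(weights)

# Example: a shape distance of 5 pixels and a chi-square color distance of 0.25
shape_sim = distance_to_similarity(5.0, scale=5.0)   # 0.5
color_sim = distance_to_similarity(0.25, scale=0.25) # 0.5
score = combined_similarity(shape_sim, 1.0, color_sim)
print(f"{score:.1f}%")
```

Distances such as the Chi-Square score grow as strokes diverge, so they need a conversion like `distance_to_similarity` before they can be averaged with measures where higher means more alike.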
Trade-offs & Future Enhancements
Algorithm Trade-offs
| Method | Strengths | Weaknesses |
|---|---|---|
| Skeletonization | Captures fine-grained stroke details | Sensitive to noise and artifacts |
| Color Histogram Matching | Insensitive to stroke position and orientation | Sensitive to lighting differences; struggles with faded paintings |
| LBP Texture Analysis | Identifies micro-patterns in brushwork | Requires high-resolution images |
💡 Planned Enhancements
🚀 Use Generative Adversarial Networks (GANs) – Train a model to generate fake paintings and compare against originals.
🚀 Deep Learning for Feature Extraction – Experiment with CNNs (Convolutional Neural Networks) for stroke-based classification.
Post-Deployment Monitoring & Real-World Applications
Once implemented, this system can assist in real-time painting authentication for museums, auction houses, and private collectors.
To maintain accuracy, continuous monitoring would be required:
🔍 Drift Detection – Periodically retraining models with new verified artwork samples.
🔍 False Positive Analysis – Ensuring genuine paintings are not incorrectly flagged.
🔍 Adaptive Learning – Implementing an active learning pipeline where expert curators validate model predictions to refine performance.
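Two of these monitoring ideas can be sketched in a few lines. The example below assumes the system keeps a log of expert-reviewed cases and a similarity score per painting; the record layout, the 50% review threshold, and the function names are all hypothetical.

```python
def false_positive_rate(records):
    """Share of expert-confirmed genuine paintings that the model flagged.

    `records` is an iterable of (flagged_by_model, expert_says_genuine) booleans.
    """
    flags_on_genuine = [flagged for flagged, is_genuine in records if is_genuine]
    if not flags_on_genuine:
        return 0.0
    return sum(flags_on_genuine) / len(flags_on_genuine)

def select_for_review(scores, k=2, threshold=50.0):
    """Return the k painting ids whose scores are closest to the decision threshold.

    These are the cases the model is least certain about, so expert labels
    on them are likely to be the most informative.
    """
    ranked = sorted(scores, key=lambda item_id: abs(scores[item_id] - threshold))
    return ranked[:k]

# Usage with made-up review data
reviewed = [(True, True), (False, True), (False, True), (True, False)]
fpr = false_positive_rate(reviewed)

pending = {"a": 95.0, "b": 52.0, "c": 10.0, "d": 45.0}
queue = select_for_review(pending, k=2)
print(round(fpr, 3), queue)
```

Tracking the false positive rate over time gives a simple signal for when the model needs retraining on newly verified samples.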



