Every year, museums and collectors spend millions of dollars authenticating paintings to distinguish real masterpieces from sophisticated forgeries. Despite advancements in carbon dating and expert curation, identifying fake paintings remains a significant challenge. This project explores how computer vision can assist curators in analyzing brush strokes to detect potential fraud before a human expert performs the final verification.

How can OpenCV-based image processing techniques be used to build a painting authentication pipeline that performs preliminary sweeps, much as applicant tracking systems filter resumes before they reach a hiring manager?

Key Research Questions

1️⃣ Can we infer what type of paintbrush was used based on a specific paint stroke?
2️⃣ What is the best way to differentiate two nearly identical brush strokes?
3️⃣ How can image processing techniques improve painting authentication?

To answer these questions, we gathered a dataset of paintings and individual brush strokes, analyzed their visual properties, and compared brushwork patterns using computer vision techniques.

Dataset Creation & Preprocessing

As with any machine learning project, a dataset and a processing pipeline must be created first:

  • Collect a dataset of historical and modern paintings.
  • Segment individual brush strokes for comparative analysis.
  • Convert images into a standardized format for processing.

This project uses a custom-built dataset of different paintbrush strokes, as well as referencing papers such as:

Example of Image Preprocessing

  1. Thresholding: Converts an image into a binary representation (black & white).
  2. Morphological Operations: Enhances structure by applying erosion (removing noise) and dilation (expanding stroke details).
  3. Skeletonization: Reduces a brush stroke to its essential structure for direct comparison.

Key Computer Vision Techniques for Stroke Comparison

1️⃣ Skeletonization: Extracting the Core Structure

To analyze brush strokes effectively, we performed skeletonization, a process that reduces a paint stroke to its thinnest version while preserving its shape.

🔹 Skeletonization Process in OpenCV:

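import cv2
import numpy as np

# Read the image directly as grayscale
image = cv2.imread('paint_stroke.jpg', cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError('paint_stroke.jpg could not be read')

# Apply thresholding; the stroke must end up white on a black background
# (use cv2.THRESH_BINARY_INV if the stroke is darker than its background)
_, binary = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

# Morphological skeletonization: repeatedly erode the stroke and, at each
# step, keep the pixels that an opening would remove
kernel = np.ones((3, 3), np.uint8)
skeleton = np.zeros_like(binary)
remaining = binary.copy()
while cv2.countNonZero(remaining) > 0:
    opened = cv2.morphologyEx(remaining, cv2.MORPH_OPEN, kernel)
    skeleton = cv2.bitwise_or(skeleton, cv2.subtract(remaining, opened))
    # borderValue=0 treats pixels outside the image as background, so the loop always ends
    remaining = cv2.erode(remaining, kernel, borderType=cv2.BORDER_CONSTANT, borderValue=0)

# Save and display the skeletonized image
cv2.imwrite('skeletonized_stroke.jpg', skeleton)
cv2.imshow('Skeletonized Stroke', skeleton)
cv2.waitKey(0)
cv2.destroyAllWindows()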
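Once two strokes have been skeletonized, they still need to be compared. One simple option is a pixel overlap (Jaccard) measure, sketched below. This is only an illustration: the function name skeleton_overlap is hypothetical, and the approach assumes both skeletons are the same size and already aligned, which real strokes rarely are without a registration step.

```python
import numpy as np

def skeleton_overlap(skel_a, skel_b):
    """Jaccard overlap between two skeleton masks (non-zero = skeleton pixel).

    Returns a value in [0, 1]: 1.0 for identical masks, 0.0 for disjoint ones.
    """
    a = np.asarray(skel_a) > 0
    b = np.asarray(skel_b) > 0
    if a.shape != b.shape:
        raise ValueError("skeletons must have the same shape")
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty masks are treated as identical
    return float(np.logical_and(a, b).sum() / union)

# Two toy skeletons: a horizontal line, and the same line shifted down one row
line = np.zeros((5, 5), np.uint8)
line[2, :] = 255
shifted = np.zeros((5, 5), np.uint8)
shifted[3, :] = 255
same_score = skeleton_overlap(line, line)
shifted_score = skeleton_overlap(line, shifted)
```

Because skeletons are only one pixel wide, even a one-pixel shift drives the score to zero, so in practice a small dilation or a distance-based measure may be more forgiving.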

2️⃣ Morphological Operations: Enhancing Brush Stroke Features

To refine the images, dilation and erosion were applied:

Operation | Effect
Dilation | Expands the white regions of a stroke, enhancing structure.
Erosion | Shrinks white regions, removing unwanted noise.

🔹 Example of Dilation & Erosion in OpenCV:

# Apply dilation
dilated = cv2.dilate(binary, kernel, iterations=1)

# Apply erosion
eroded = cv2.erode(binary, kernel, iterations=1)

cv2.imwrite('dilated_stroke.jpg', dilated)
cv2.imwrite('eroded_stroke.jpg', eroded)

3️⃣ Color & Texture Matching for Authentication

While skeletonization focuses on stroke shape, additional comparisons were performed using:

  • Color Histograms – Comparing the color distribution of a candidate painting against that of verified works.
  • Texture Matching – Using LBP (Local Binary Patterns) to capture micro-textural details in brushwork.

🔹 Example of Color Histogram Analysis in OpenCV:

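def color_histogram(path):
    # Load in color (BGR); a grayscale image carries no color information
    bgr = cv2.imread(path, cv2.IMREAD_COLOR)
    # Compute a 3D color histogram (8 bins per channel), flattened to a vector
    hist = cv2.calcHist([bgr], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3).flatten()
    # Normalize histogram so the bins sum to 1
    return hist / hist.sum()

# 'reference_stroke.jpg' is a placeholder name for a verified stroke image
hist1 = color_histogram('paint_stroke.jpg')
hist2 = color_histogram('reference_stroke.jpg')

# Compare histograms (Chi-Square method): 0 means identical, larger means more different
comparison_score = cv2.compareHist(hist1, hist2, cv2.HISTCMP_CHISQR)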

💡 Why Color & Texture Matching?

  • Helps differentiate paintings created under different lighting conditions.
  • Identifies subtle variations in paint layering and brush pressure.
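For the texture side, a basic Local Binary Pattern descriptor can be sketched in plain NumPy. This is an illustrative version of the classic 3×3, 8-neighbor LBP rather than the exact variant used in the project; the function name lbp_histogram is hypothetical, and libraries such as scikit-image offer more complete implementations.

```python
import numpy as np

def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    Each interior pixel gets an 8-bit code: bit k is set when the k-th
    neighbor is greater than or equal to the center pixel.
    """
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbor offsets (row, column), clockwise from the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

# A perfectly flat patch: every neighbor equals the center, so every code is 255
flat_patch = np.full((6, 6), 120, dtype=np.uint8)
flat_hist = lbp_histogram(flat_patch)
```

Two strokes can then be compared by measuring the distance between their LBP histograms, for example with the same Chi-Square comparison used for color.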

Generating a Similarity Score for Brush Stroke Matching

The final step was creating a numerical similarity score by combining all processed data:

1️⃣ Skeletonized stroke comparison → Measures shape similarity.
2️⃣ Texture similarity using LBP → Quantifies fine-grained details.
3️⃣ Color histogram matching → Ensures consistency in pigmentation.

Each comparison method was weighted, and a final similarity score was assigned between 0% (no match) and 100% (identical strokes).
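The weighting step can be sketched as follows. The weights shown are placeholders, since the original weights are not stated here, and the helper names similarity_score and distance_to_similarity are hypothetical. The sketch assumes each component has already been mapped to a similarity between 0 and 1.

```python
def distance_to_similarity(distance):
    """Map a non-negative distance (e.g. Chi-Square) to a similarity in (0, 1]."""
    return 1.0 / (1.0 + distance)

def similarity_score(shape_sim, texture_sim, color_sim, weights=(0.4, 0.3, 0.3)):
    """Combine three similarities in [0, 1] into a percentage from 0 to 100.

    `weights` are illustrative placeholders, not the project's actual values.
    """
    components = (shape_sim, texture_sim, color_sim)
    for value in components:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each similarity must lie in [0, 1]")
    total_weight = sum(weights)
    weighted = sum(w * v for w, v in zip(weights, components))
    return 100.0 * weighted / total_weight

# Example: strong shape match, moderate texture match, color distance of 0.25
score = similarity_score(0.9, 0.6, distance_to_similarity(0.25))
```

Dividing by the total weight keeps the result within 0 to 100 even if the weights are later tuned and no longer sum to one.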

Trade-offs & Future Enhancements

Algorithm Trade-offs

Method | Strengths | Weaknesses
Skeletonization | Captures fine-grained stroke details | Sensitive to noise and artifacts
Color Histogram Matching | Robust against lighting differences | Struggles with faded paintings
LBP Texture Analysis | Identifies micro-patterns in brushwork | Requires high-resolution images

💡 Planned Enhancements
🚀 Use Generative Adversarial Networks (GANs) – Train a model to generate fake paintings and compare against originals.
🚀 Deep Learning for Feature Extraction – Experiment with CNNs (Convolutional Neural Networks) for stroke-based classification.

Post-Deployment Monitoring & Real-World Applications

Once implemented, this system can assist in real-time painting authentication for museums, auction houses, and private collectors.

To maintain accuracy, continuous monitoring would be required:

🔍 Drift Detection – Watching for shifts in incoming image data and periodically retraining models with new verified artwork samples.
🔍 False Positive Analysis – Ensuring genuine paintings are not incorrectly flagged.
🔍 Adaptive Learning – Implementing an active learning pipeline where expert curators validate model predictions to refine performance.
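
The false positive analysis above can be made concrete with a small helper that summarizes curator-reviewed results. This is a minimal sketch: the record format (a pair of booleans per painting) and the function name false_positive_rate are assumptions for illustration, not part of a deployed system.

```python
def false_positive_rate(records):
    """Fraction of genuine paintings that the system flagged as suspicious.

    `records` is an iterable of (flagged, is_genuine) boolean pairs, where
    `is_genuine` comes from the curator's final verification.
    """
    flags_for_genuine = [flagged for flagged, is_genuine in records if is_genuine]
    if not flags_for_genuine:
        return 0.0  # no genuine paintings reviewed yet
    return sum(flags_for_genuine) / len(flags_for_genuine)

# Hypothetical review log: (flagged by the pipeline, confirmed genuine by a curator)
review_log = [
    (True, True),    # genuine painting incorrectly flagged
    (False, True),
    (False, True),
    (False, True),
    (True, False),   # forgery correctly flagged
]
fpr = false_positive_rate(review_log)
```

Tracking this rate over time, alongside the curator corrections themselves, gives the active learning loop a signal for when retraining is due.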