A system, method and computer program product for identifying spam in an image, including (a) identifying a plurality of contours in the image, the contours corresponding to probable symbols; (b) ignoring contours that are too small or too large; (c) identifying text lines in the image, based on the remaining contours; (d) parsing the text lines into words; (e) ignoring words that are too short or too long from the identified text lines; (f) ignoring text lines that are too short; (g) verifying that the image contains text by comparing a number of pixels of a symbol color within remaining contours to a total number of pixels of the symbol color in the image, and that there is at least one text line after filtration; and (h) if the image contains text, rendering a spam/no spam verdict based on a contour representation of the text that which appears after step (f).

 
Web www.patentalert.com

< Identification and filtration of digital communications

< Method and apparatus for processing network packets

> Collaborative communication platforms

> System and method for identifying text-based SPAM in rasterized images

~ 00602