Tesseract OCR can be trained to recognize new languages and scripts. Community-made trained data files are available as separate packages, one per language.
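As a rough illustration of how an installed language package is selected at run time, the sketch below builds a `tesseract` command line using the `-l` option and queries the installed languages with `--list-langs`. The helper names `build_ocr_command` and `installed_languages`, the file names, and the language codes `eng` and `deu` are assumptions chosen for the example; the header-line handling reflects typical output and may need adjusting for a given Tesseract version.

```python
import shutil
import subprocess


def build_ocr_command(image_path, output_base, languages):
    """Return the argv list for running tesseract with the given language codes.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Multiple languages are combined with '+', e.g. 'eng+deu'.
    return ["tesseract", image_path, output_base, "-l", "+".join(languages)]


def installed_languages():
    """Return the language codes reported by `tesseract --list-langs`.

    Returns an empty list when the tesseract binary is not on PATH.
    """
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some versions print the list to stderr instead of stdout.
    lines = (result.stdout or result.stderr).splitlines()
    # Assumption: the first line is a header such as "List of available languages ...".
    return [line.strip() for line in lines[1:] if line.strip()]
```

Calling `build_ocr_command("scan.png", "out", ["eng", "deu"])` yields a command that can be passed to `subprocess.run`, provided the corresponding language packages are installed.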