
Image Text Extraction Program

Program name

A program that extracts text from images

Installation

1. Visual C++
2. Tesseract
3. pytesseract

Visual C++

Tesseract

Korean (kor) language patch

tessdata
tesseract-ocr
์„ค์น˜๋œ ๊ฒฝ๋กœ์— kor.traineddata ํŒŒ์ผ์„ ์ €์žฅํ•ด์ฃผ์„ธ์š”!
C:\Program Files\Tesseract-OCR\tessdata
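To confirm the language file landed in the right place, a small check like the following may help. It is only a sketch: the helper name `missing_language_files` is hypothetical, and the folder is assumed to be the default install path shown above.

```python
from pathlib import Path

# Assumed default install location, taken from the path above.
# Adjust it if Tesseract was installed somewhere else.
TESSDATA_DIR = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def missing_language_files(tessdata_dir, languages=("eng", "kor")):
    """Return the language codes whose .traineddata file is not in tessdata_dir."""
    folder = Path(tessdata_dir)
    return [
        code for code in languages
        if not (folder / f"{code}.traineddata").is_file()
    ]


if __name__ == "__main__":
    missing = missing_language_files(TESSDATA_DIR)
    if missing:
        print("Missing language data:", ", ".join(missing))
    else:
        print("eng and kor language data found.")
```

If `kor` is reported as missing, the file was probably saved to a different folder than the one Tesseract reads from.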

pytesseract

pip install pytesseract
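Once the three installation steps are done, it can be useful to check that Python can see everything. The sketch below uses only the standard library. The function name `check_setup` is hypothetical, and the executable path is assumed to be the default Windows location.

```python
import importlib.util
import shutil
from pathlib import Path

# Assumed default location of the Tesseract executable on Windows.
DEFAULT_TESSERACT_EXE = Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe")


def check_setup(tesseract_exe=DEFAULT_TESSERACT_EXE):
    """Report whether each piece of the OCR setup can be found."""
    tesseract_found = (
        shutil.which("tesseract") is not None  # executable is on PATH
        or Path(tesseract_exe).is_file()       # or at the assumed location
    )
    return {
        "tesseract": tesseract_found,
        # True if the Python packages can be imported.
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "PIL": importlib.util.find_spec("PIL") is not None,
    }


if __name__ == "__main__":
    for name, found in check_setup().items():
        print(f"{name}: {'OK' if found else 'not found'}")
```

The `PIL` entry refers to the Pillow package, which the program further down imports to open the image.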

Program


Prompt

Make a simple Python program that extracts text from an image

Code

import pytesseract
from PIL import Image

# Set the path to the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Load the image
image = Image.open('text.png')

# Extract text from the image (Korean + English)
text = pytesseract.image_to_string(image, lang='eng+kor')

# Print the extracted text
print(text)
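The same steps can also be wrapped in a function so they are easy to reuse for more than one image. This is a sketch, not part of the generated program: the function name `extract_text` is hypothetical, and the executable path is assumed to be the default Windows location used above.

```python
from pathlib import Path

# Assumed default install location on Windows, as in the script above.
TESSERACT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def extract_text(image_path, lang="eng+kor"):
    """Return the text Tesseract recognizes in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")

    # Imported here so that a missing image is reported even on a machine
    # where the OCR packages are not installed yet.
    import pytesseract
    from PIL import Image

    # Point pytesseract at the assumed location only if it exists there;
    # otherwise pytesseract keeps looking for "tesseract" on PATH.
    if Path(TESSERACT_EXE).is_file():
        pytesseract.pytesseract.tesseract_cmd = TESSERACT_EXE

    # Close the image file once OCR is finished.
    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

With the setup from the Installation section in place, `print(extract_text('text.png'))` should behave like the script above, and passing `lang='kor'` would restrict recognition to Korean.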