PDF extraction hacks
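A minimal sketch of the page-to-PNG conversion from this post, wrapped in a reusable function. The helper names `page_filename` and `pdf_to_pngs`, the `out_dir` parameter, and the zero-padded, 1-based file names are assumptions made for illustration, not part of the original snippet. It also assumes that `pdf2image` and its Poppler dependency are installed.

```python
from pathlib import Path


def page_filename(index: int, total: int, stem: str = "page") -> str:
    """Build a zero-padded PNG name so files sort in page order (hypothetical helper)."""
    width = len(str(total))
    return f"{stem}{index:0{width}d}.png"


def pdf_to_pngs(pdf_path: str, out_dir: str = ".", dpi: int = 200) -> list[Path]:
    """Render every page of pdf_path to a PNG in out_dir and return the paths."""
    # Imported here so the naming helper works even without pdf2image installed.
    from pdf2image import convert_from_path

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # convert_from_path returns one PIL image per page.
    images = convert_from_path(pdf_path, dpi=dpi)

    paths = []
    for i, image in enumerate(images, start=1):
        path = out / page_filename(i, len(images))
        image.save(path, "PNG")
        paths.append(path)
    return paths
```

Calling `pdf_to_pngs('in.pdf', out_dir='pages')` would write one PNG per page into `pages/`. Zero-padding keeps the files in page order when they are sorted by name, which plain `page10.png` versus `page2.png` does not.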
#| label: pdf-2-png
#| fig-cap: "convert pdf to png"
from pdf2image import convert_from_path

pdf_path = 'in.pdf'
# Render each page of the PDF to a PIL image
images = convert_from_path(pdf_path)
for i, image in enumerate(images):
    # Save each page as a PNG file: page0.png, page1.png, ...
    image.save('page' + str(i) + '.png', 'PNG')
Citation
BibTeX citation:
@online{bochman2022,
author = {Bochman, Oren},
title = {PDF Extraction Hacks},
date = {2022-04-10},
url = {https://orenbochman.github.io/posts/2020/04-10-pdf-extraction/},
langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2022. “PDF Extraction Hacks.” April 10. https://orenbochman.github.io/posts/2020/04-10-pdf-extraction/.