Thu, September 18, 2025
4 min read
Prototype Development of a PDF to Text Extraction and Image Conversion CLI Tool with Node.js and WASM Libraries
#nodejs
#cli
#pdf
#wasm
I wanted to extract text and images from PDFs so they could be vectorized for RAG (retrieval-augmented generation). To that end, I developed a prototype CLI tool that is easy to use with Node.js. The key point is that it relies on WebAssembly (Wasm) libraries, so it can also run in serverless environments. This article covers the reasoning behind the technology selection and gives an overview of the implementation.