pip install doctable
doctable consists of several classes for working with text data.
This demonstration shows a typical workflow using DocTable and DocParser. We use DocParser to download, tokenize, and create parse trees for 17 national security strategy documents. DocParser provides the parallelized parsing workflow, and a custom DocTable class provides efficient storage. See the Intro to DocTable example.
Class for distributing text processing tasks across multiple processes for insertion into databases. It works similarly to multiprocessing.Pool() but handles pipes differently, to support passing larger data between processes and processing at the chunk level.
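The chunk-level idea can be sketched with the standard library alone. The example below does not use doctable's API: the names make_chunks, parse_chunk, and parse_and_store are hypothetical, the whitespace tokenizer is a stand-in for real parsing, and rows are inserted from the parent process rather than from the workers. It assumes only multiprocessing.Pool and sqlite3.

```python
import multiprocessing as mp
import sqlite3


def make_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def parse_chunk(chunk):
    """Process one whole chunk in a worker (hypothetical stand-in parser).

    Returns (raw_text, space-joined lowercase tokens) for every text.
    """
    return [(text, " ".join(text.lower().split())) for text in chunk]


def parse_and_store(texts, db_path, n_workers=2, chunk_size=2):
    """Parse chunks in worker processes and insert the rows into SQLite.

    Assumption: inserts happen in the parent process, one chunk at a time,
    which keeps a single writer on the database.
    """
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS docs (raw TEXT, tokens TEXT)")
    with mp.Pool(n_workers) as pool:
        # imap yields each chunk's rows as soon as that chunk is finished
        for rows in pool.imap(parse_chunk, make_chunks(texts, chunk_size)):
            con.executemany("INSERT INTO docs VALUES (?, ?)", rows)
    con.commit()
    n_rows = con.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
    con.close()
    return n_rows


if __name__ == "__main__":
    # The guard is required on platforms that start workers by spawning
    sample = ["First document text", "Second document", "A third one"]
    print(parse_and_store(sample, ":memory:"))
```

Sending whole chunks to each worker, rather than single documents, reduces the number of messages that cross process boundaries, which is the motivation the description above gives for chunk-level processing.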