PROGRAMS BY wikipedia2xml.sf.net
wikipedia2XML Free
A collection of Python scripts for creating and handling an XML corpus (a large collection of text for linguistic purposes) from an original Wikipedia database backup dump. It includes a regular-expression-based parser for the Medi