Code license: New BSD License
Labels: PHP, Wikipedia, Wiki, Import, CURL
Project owners:
  jonathon.marshall
Written by Jonathon Marshall (jwm at 1101001 dot com) for importing custom encyclopedia datasets from BMEzine.com into our own Wikipedia.

The code is simple enough that I will just include it here. I'm happy to answer any questions, but it's pretty straightforward.

<?php
#
# Wikipedia Import Script - jwm@1101001.com - PHP & CURL
#
# In my case I was importing a directory full of text files that Shannon Larratt had already prepared in wiki format.
#
# ex. Tattoo_Ink.txt
#
# [[Category: Reference]]
#
# Tattoo 'inks' are the substance that a [[tattoo machine]] places under
# your [[skin]] in order to leave you with a permanent mark. I've put
# inks in quotes because technically speaking it's not really ink —
# it's actually pigment (generally metal salts or even plastics)
# suspended in a carrier solution (which keeps the pigments evenly mixed,
# applicable, and helping keep it clean).
#

$ENCYC_DIR = '/home/jwm/wiki';

# Read File

if ($dh = opendir($ENCYC_DIR))
{
    while (false !== ($file = readdir($dh)))
    {
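        # Skip '.', '..', subdirectories and anything that is not a .txt file
        if (substr($file, -4) !== '.txt' || !is_file($ENCYC_DIR.'/'.$file))
            continue;

        if ($fh = fopen($ENCYC_DIR.'/'.$file, 'r'))
        {
            # Page title is the filename minus its 4-character '.txt' extension
            $ENCYC_TITLE = substr($file,0,-4);
            $ENCYC_TEXT = '';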


            while (!feof($fh))
            {
                $ENCYC_TEXT .= fgets($fh, 4096);
            }

            fclose($fh);

            # Prepare Data

            $PostFields = array(
                # Datetime of creation (change this to suit you)
                'wpStarttime' => '20060412223042',
                'wpEdittime' => '20060412223042',
                # No quotes here: single quotes would send the literal text '$ENCYC_TEXT'
                'wpTextbox1' => $ENCYC_TEXT,
                # This value may be different for your site.
                # Just grab this value from the wiki forms source.
                'wpAutoSummary' => 'd41d8cd98f00b204e9800998ecf8427e'
            );

            $PostString = '';
            foreach($PostFields as $key=>$value)
                # $key must sit outside the quotes so that it is expanded
                $PostString .= $key.'='.urlencode($value).'&';

            # Remove the trailing '&' from our post string
            # - sloppy but effective
            $PostString = substr($PostString,0,-1);

            # Post Data

            # Encode the title so spaces and punctuation survive in the query string
            $PostURL = 'http://yourdomain.com/index.php?title='.urlencode($ENCYC_TITLE).'&action=edit';

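            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $PostURL);
            curl_setopt($ch, CURLOPT_POST, 1);
            curl_setopt($ch, CURLOPT_POSTFIELDS, $PostString);
            # Report transport-level failures instead of silently ignoring them
            if (curl_exec($ch) === false)
                echo 'Import of '.$ENCYC_TITLE.' failed: '.curl_error($ch)."\n";
            curl_close($ch);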
        }
    }
    closedir($dh);
}
?>
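
The POST string and URL above are assembled by hand. As a rough sketch of an alternative, PHP's built-in http_build_query() can do the encoding and joining in one call, and pathinfo() can strip the extension from the filename. The helper name buildEditRequest() and its parameters are hypothetical and not part of the original script; the form field names and values are simply copied from the script above and may differ on your wiki.

```php
<?php
# Sketch only: buildEditRequest() is a hypothetical helper, not part of the
# original script. It assumes the same form fields the script above posts.

function buildEditRequest($baseUrl, $file, $text)
{
    # Page title is the filename without its extension, e.g. Tattoo_Ink
    $title = pathinfo($file, PATHINFO_FILENAME);

    $fields = array(
        'wpStarttime'   => '20060412223042',
        'wpEdittime'    => '20060412223042',
        'wpTextbox1'    => $text,
        'wpAutoSummary' => 'd41d8cd98f00b204e9800998ecf8427e'
    );

    return array(
        'url'  => $baseUrl.'?title='.urlencode($title).'&action=edit',
        # http_build_query() URL-encodes each value and joins the pairs
        # with '&', so there is no trailing separator to trim afterwards
        'body' => http_build_query($fields, '', '&')
    );
}

$request = buildEditRequest(
    'http://yourdomain.com/index.php',
    'Tattoo_Ink.txt',
    "[[Category: Reference]]\n"
);

echo $request['url'], "\n";
echo $request['body'], "\n";
```

The returned 'url' and 'body' values could then be passed to CURLOPT_URL and CURLOPT_POSTFIELDS in place of $PostURL and $PostString.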