Written by Jonathon Marshall (jwm at 1101001 dot com) for importing custom encyclopedia datasets from BMEzine.com into our own Wikipedia.
The code is simple enough that I will just include it here. I'm happy to answer any questions, but it's pretty straightforward.
<?php
#
# Wikipedia Import Script - jwm@1101001.com - PHP & CURL
#
# In my case I was importing a directory full of text files that Shannon Larratt had already prepared in wiki format.
#
# ex. Tattoo_Ink.txt
#
# [[Category: Reference]]
#
# Tattoo 'inks' are the substance that a [[tattoo machine]] places under
# your [[skin]] in order to leave you with a permanent mark. I've put
# inks in quotes because technically speaking it's not really ink —
# it's actually pigment (generally metal salts or even plastics)
# suspended in a carrier solution (which keeps the pigments evenly mixed,
# applicable, and helping keep it clean).
#
$ENCYC_DIR = '/home/jwm/wiki';
# Read File
if ($dh = opendir($ENCYC_DIR))
{
    while (false !== ($file = readdir($dh)))
    {
        # Skip '.', '..' and anything that is not a .txt file
        if (substr($file, -4) !== '.txt')
            continue;
        if ($fh = fopen($ENCYC_DIR.'/'.$file, 'r'))
        {
            # Page title is the filename minus extension
            $ENCYC_TITLE = substr($file,0,-4);
            $ENCYC_TEXT = '';
            while (!feof($fh))
            {
                $ENCYC_TEXT .= fgets($fh, 4096);
            }
            fclose($fh);
            # Prepare Data
            $PostFields = array(
                # Datetime of creation (change this to suit you)
                'wpStarttime' => '20060412223042',
                'wpEdittime' => '20060412223042',
                # No quotes here: single quotes would post the literal string '$ENCYC_TEXT'
                'wpTextbox1' => $ENCYC_TEXT,
                # This value may be different for your site.
                # Just grab this value from the wiki forms source.
                'wpAutoSummary' => 'd41d8cd98f00b204e9800998ecf8427e'
            );
            $PostString = '';
            foreach($PostFields as $key=>$value)
                $PostString .= $key.'='.urlencode($value).'&';
            # Remove the trailing '&' from our post string
            # - sloppy but effective
            $PostString = substr($PostString,0,-1);
            # Post Data (the title is URL-encoded in case it contains special characters)
            $PostURL = 'http://yourdomain.com/index.php?title='.urlencode($ENCYC_TITLE).'&action=edit';
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $PostURL);
            curl_setopt($ch, CURLOPT_POST, 1);
            curl_setopt($ch, CURLOPT_POSTFIELDS, $PostString);
            curl_exec($ch);
            curl_close($ch);
        }
    }
    closedir($dh);
}
?>
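The script above builds its POST string by hand and trims the trailing '&' afterwards. As a possible alternative, here is a minimal sketch of two small helpers that do the same job with PHP built-ins. The function names title_from_filename and build_post_string are hypothetical and not part of the original script; the sketch assumes every page lives in a file ending in .txt, as in the Tattoo_Ink.txt example.

```php
<?php
# Hypothetical helpers, not part of the original import script.

# Page title is the filename minus its extension. Returns null for
# anything that is not a .txt file, which also covers '.' and '..'.
function title_from_filename($file)
{
    if (strtolower(pathinfo($file, PATHINFO_EXTENSION)) !== 'txt')
        return null;
    return pathinfo($file, PATHINFO_FILENAME);
}

# Build a URL-encoded POST body from an array of form fields.
# http_build_query() encodes each value and joins the pairs with '&',
# so there is no trailing '&' to strip afterwards.
function build_post_string(array $fields)
{
    return http_build_query($fields, '', '&');
}

# Example usage with the same field names the script posts.
$title = title_from_filename('Tattoo_Ink.txt');
$body = build_post_string(array(
    'wpStarttime' => '20060412223042',
    'wpEdittime'  => '20060412223042',
    'wpTextbox1'  => "[[Category: Reference]]\n\nTattoo 'inks' are ...",
));
echo $title, "\n", $body, "\n";
```

In the main loop, a null title would be a signal to skip the directory entry, and the returned string could be passed to CURLOPT_POSTFIELDS in place of $PostString.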