| Issue 20: | Awesome imdb integration! | |
It is possible to grab some film information from IMDb! How? Let me explain...

IMDB WEB SCRAPER (Open Source)
http://lab.abhinayrathore.com/imdb/imdbWebService.php?m=Titanic&o=xml

EXAMPLE CODE
# Grab IMDB information table
# $FILM is 'FILMNAME (YEAR)'
wget -O imdb.xml "http://lab.abhinayrathore.com/imdb/imdbWebService.php?m=$FILM&o=xml"
# Grab film title (awesome for PCH!!)
# Many languages available!
# Popcorn users will LOVE this! NMT jukebox requires the English title only!!
# With this scraper you can search using e.g. a French/Italian/Spanish film name and get the English film name required by Jukebox!
TITLE=`xpath -q -e '//ALSO_KNOWN_AS' imdb.xml | grep Italy | sed 's/<[^>]*>//g' | sed 's/=.*//g' | sed 's/&#x27;/'"'"'/g'`;
# Grab film year
YEAR=`xpath -q -e '//YEAR' imdb.xml | sed 's/<[^>]*>//g'`;
# Grab film POSTER (awesome!!)
# Many posters available!
POSTER=`xpath -q -e '//POSTER_SMALL' imdb.xml | sed 's/<[^>]*>//g'`;
wget "$POSTER"
# Grab rating (awesome!!)
RATING=`xpath -q -e '//RATING' imdb.xml | sed 's/<[^>]*>//g'`;
# Grab imdb title ID (tt0120338)
# Useful for Popcorn Hour!!!
IMDB=`xpath -q -e '//TITLE_ID' imdb.xml | sed 's/<[^>]*>//g'`;
Nov 14, 2011
Project Member
#1
login...@gmail.com
Nov 15, 2011
Quick & (absolutely) dirty implementation. It needs the nfo destination folder at line 610.
Nov 15, 2011
Thanks for your input and really sorry for not getting back to you earlier. I'll definitely look into it next week-end. I'll need to look into wget and make sure there is a timeout that can be configured. I'll also need to make sure wget is installed and input its path. Also one thing I'll have to look into is that http://labaia.hellospace.net/imdbWebService.php website. It's amazing what it is able to spit out. I'll also need to make sure the rest of the script won't choke on those nfo files. Thanks for your help
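The two checks mentioned in this comment (is a downloader installed, and can its timeout be configured) could be sketched as follows. This is only an illustration, not code from torrentexpander: the function names `find_downloader` and `fetch_with_timeout` are hypothetical. It relies on `command -v` to report the path of an installed tool, on wget's `--timeout` and `--tries` options, and on curl's `--max-time` option.

```shell
# Hypothetical helper: print the path of the first available downloader.
# Returns non-zero when neither wget nor curl is installed.
find_downloader() {
    for tool in wget curl; do
        if command -v "$tool" >/dev/null 2>&1; then
            command -v "$tool"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: fetch URL $1 into file $2, giving up after $3 seconds.
fetch_with_timeout() {
    downloader=$(find_downloader) || return 1
    case "$downloader" in
        *wget) "$downloader" -q --timeout="$3" --tries=1 -O "$2" "$1" ;;
        *curl) "$downloader" -s --max-time "$3" -o "$2" "$1" ;;
    esac
}
```

A call such as `fetch_with_timeout "$url" imdb.xml 15` would then fail with a non-zero status instead of hanging when the web service does not answer.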
Nov 16, 2011
Scraper file n.1
Nov 16, 2011
Scraper file n.2
Nov 16, 2011
New version of the modded scripts. Added poster download and some minor updates. I don't know if PCH supports xpath. I doubt it.
Nov 16, 2011
Thanks. I browsed through the whole script, and here's what I plan on doing in terms of IMDB integration. Let me know what you think of it. I won't start coding until I'm sure I haven't forgotten anything... also, I only have time to work on it during weekends...
- Add the imdb options to the script parameters
- Add these new parameters to the settings.ini file
- Detect the wget path
- Save the movie title / series title in a new variable (X) and the file name after it has been renamed in another variable (Y):
-> If there are multiple video files with a movie pattern / series pattern in the surrounding folder, retain the name of the folder in its imdb perfect match format (remove season.* from the end of the name if it's a series pack)
-> If there is only one video file with a movie pattern, retain the name of the file without its extension in its perfect match format
- Make sure http://labaia.hellospace.net/imdbWebService.php is up and running
- Add the lines you kindly supplied in a new subroutine right before the "Convert DTS track..." routine. If the lookup fails, the script has to be able to recover. Also:
-> The title stored in variable X will be fetched for NFO and JPG
-> Add the nfo and jpg extensions to the movies_extensions_rev variable
-> If it is a single-file movie and NFO/JPG files were downloaded (count files), name those NFO and JPG files variable_Y.extension and:
   - put all these files (movie included) in a new folder named variable_Y
   - rewrite $log_files with those files and the new folder
-> If it is a multi-file movie / series pack, fetch variable X for NFO / JPG and store / name them like the files included in the surrounding folder (there will then be movie_part_1.nfo, movie_part_1.jpg, movie_part_2.nfo, movie_part_2.jpg + movie_part_1.avi and movie_part_2.avi in this folder... maybe many more for a series pack) and list all that in the $log_files file.
And of course credit you in the script to thank you for your help :-)
Thanks again for your interest in torrentexpander.
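The rule for choosing the lookup title (variable X) could be sketched roughly like this. The function name `lookup_title` is hypothetical, and the sketch simplifies the plan above in two ways that are assumptions: any directory is treated as the "multiple video files" case, and "season.*" is stripped with a plain `sed` pattern that would also cut a real title containing the word "season".

```shell
# Hypothetical helper: derive the title used for the IMDB lookup from path $1.
# Directory: use the folder name, dropping a trailing "season..." part.
# Single file: use the file name without its extension.
lookup_title() {
    name=$(basename "$1")
    if [ -d "$1" ]; then
        printf '%s\n' "$name" | sed 's/[ ._-]*[Ss]eason.*$//'
    else
        printf '%s\n' "${name%.*}"
    fi
}
```

For a single file called `Avatar (2009).avi` this yields `Avatar (2009)`; for a folder called `Doctor Who Season 3` it yields `Doctor Who`.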
Nov 17, 2011
1) Yes, add imdb options and settings. I suggest "produce_imdb_nfo", "download film poster", "poster_format". The "poster_format" variable can be 'normal', 'large', 'small' or 'full'.
2) Of course.
3) OK, but... what about xpath, are you able to replace its function? ;-)
4) An example is required. ( ?? )
5) Of course.
6) I think the most important thing is "let users choose what to do". If a user wants to rename files using 'type_3' rules but also wants to produce the .nfo and .jpg of the movie, then he can do it, even though 'type_1' rules are necessary to get correct imdb information. The imdb implementation must be separate from the renaming script, but the film name must match the .nfo & .jpg filename. Users may decide not to rename files but may still want to use imdb.
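As a small illustration of point 1, the proposed "poster_format" setting could be checked before use. The function name `normalize_poster_format` is hypothetical, and falling back to 'normal' for an unknown value is an assumption made for this sketch, not something decided in this thread.

```shell
# Hypothetical helper: accept only the four poster sizes proposed above.
# Assumption: any other value falls back to "normal".
normalize_poster_format() {
    case "$1" in
        normal|large|small|full) printf '%s\n' "$1" ;;
        *) printf '%s\n' "normal" ;;
    esac
}

# Sanitize whatever value was read from the script header or settings.ini.
poster_format=$(normalize_poster_format "${poster_format:-}")
```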
Nov 18, 2011
This is my concept map. Please have a look. I'm not sure if everything is correct and matches torrentexpander's features. Feel free to edit the concept map; we could use it for documentation. Thanks for programming the best automatic rename tool in the world.
Nov 18, 2011
Hi
Thank you for all the time you spent improving torrentexpander.
I finally took time to give your imdb integration routine a try.
wget is not always installed by default (for example on Mac OS X), so I tried curl instead.
Depending on which one is installed, I'll automatically switch to the right one.
I kinda improved some lines by storing the xml in a variable and dropped xpath dependencies.
I'll spend time on imdb integration and your concept map this week-end.
Thanks for your help.
PS: I'm no programmer and I wrote my first lines of code when I started torrentexpander not so long ago, so it's nice to know you like it.
Take a look at the rewrite:
# IMDB integration
nfo_file=`echo "$title_clean_ter_other_pat".nfo`;
poster=`echo "$title_clean_ter_other_pat".jpg`;
xml_cont="$(curl -i "http://labaia.hellospace.net/imdbWebService.php?m=$title_clean_ter_other_pat&o=xml")"
wait
imdb_url=`echo "$(echo $xml_cont | egrep -o "<IMDB_URL>.*</IMDB_URL>" | sed -e 's;\(<IMDB_URL>\)\(.*\)\(</IMDB_URL>\);\2;')"`;
poster_url=`echo "$(echo $xml_cont | egrep -o "<POSTER>.*</POSTER>" | sed -e 's;\(<POSTER>\)\(.*\)\(</POSTER>\);\2;')"`;
if [ "$imdb_url" != "" ]; then
step_number=$(( $step_number + 1 ))
echo "Step $step_number : Building .nfo";
echo "$imdb_url" > "$destination_folder/$nfo_file";
fi
if [ "$poster_url" != "" ]; then
step_number=$(( $step_number + 1 ))
echo "Step $step_number : Downloading Poster";
curl -o "$destination_folder/$poster" "$poster_url";
# wget -q -O "$destination_folder/$poster" $poster_url;
wait
fi
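The rewrite above repeats the same egrep/sed pair for every field. One way this could be factored out is a single helper, sketched below. The name `xml_tag` is hypothetical (it is not a torrentexpander function), and the sketch assumes that the tag has no attributes and that its opening and closing tags sit on the same line.

```shell
# Hypothetical helper: print the text inside the first <TAG>...</TAG> pair
# found in the XML string $1, where $2 is the tag name.
# Assumes no attributes and no nested tags inside the element.
xml_tag() {
    printf '%s\n' "$1" \
        | grep -o "<$2>[^<]*</$2>" \
        | head -n 1 \
        | sed -e "s;<$2>;;" -e "s;</$2>;;"
}
```

With this in place, a line such as `imdb_url=$(xml_tag "$xml_cont" IMDB_URL)` would replace the longer pipeline, and an absent tag simply gives an empty string.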
Nov 18, 2011
Excellent! I'm studying a way to get fanart images using this: http://api.themoviedb.org/2.1/methods/Movie.getImages
To prepare for this, I suggest grabbing TITLE_ID:
title_id=`echo "$(echo $xml_cont | egrep -o "<TITLE_ID>.*</TITLE_ID>" | sed -e 's;\(<TITLE_ID>\)\(.*\)\(</TITLE_ID>\);\2;')"`;
Nov 18, 2011
Very simple!
fanart=`echo "$title_clean_ter_other_pat".fanart.jpg`;
wget/curl http://api.themoviedb.org/2.1/Movie.getImages/en/xml/57983e31fb435df4df77afb854740ea9/$title_id
# then grab the URL of a random backdrop image in size $fanart_size (the user chooses it depending on the TV)
wget -q -O "$destination_folder/$fanart" "$fanart_url";
Nov 19, 2011
Hi Loginbug,
I just created a Torrentexpander 101 wiki page to help you understand the basic structure of torrentexpander:
https://code.google.com/p/torrentexpander/wiki/Torrentexpander_in_depth?ts=1321742965&updated=Torrentexpander_in_depth
Your idea of maintaining a concept map is great, but due to the length of the script, we'll need to use modeling software. Torrentexpander is only 800 lines long but it is already fairly complex. I only started this project a few months ago and I am already losing track of what each line does and why it does it. Right now, I'm reviewing the whole script in order to refresh my memory and be more efficient while adding the imdb functionality.
Nov 19, 2011
Thanks, I will read it. Anyway, I found a bug in your imdb script, in the curl command. This is the correct way:
# curl dislikes spaces; replace spaces with +
title_clean_ter_other_pat_nospace=`echo "$title_clean_ter_other_pat" | sed 's/ /+/g'`;
xml_cont="$(curl -i "http://labaia.hellospace.net/imdbWebService.php?m=$title_clean_ter_other_pat_nospace&o=xml")"
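The space replacement can be wrapped in a small function so it is easy to check on its own. The name `plus_encode_spaces` is hypothetical, and the sketch only handles spaces; other reserved characters such as "&" in a title would still need proper URL encoding.

```shell
# Hypothetical helper: replace each space in $1 with "+" for use in a
# query string. Only spaces are handled; other reserved characters are
# passed through unchanged.
plus_encode_spaces() {
    printf '%s\n' "$1" | sed 's/ /+/g'
}
```

When curl is the downloader, another option would be to let curl do the encoding itself with `curl -G --data-urlencode "m=$title" --data-urlencode "o=xml" <service URL>`, which also covers characters other than spaces.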
Nov 20, 2011
The latest working version of the modded script. NFO + POSTER + FANART available. I prefer to use the grep command instead of egrep, and xml files instead of variables.
Nov 20, 2011
Check out SVN release r81.
IMDB is now integrated.
I still have an issue with curl not setting the mime type for images.
Also, I commented out the fanart lines because I haven't had enough time to make them work.
You need to enable this at the beginning of the script or in your settings.ini file:
imdb_poster="yes"
imdb_poster_format="normal"
imdb_nfo="yes"
imdb_fanart="yes"
imdb_fanart_format="w1280"
Nov 21, 2011
Good, but imdb plugin should work even if 'clean_filename'=no
Nov 21, 2011
I have seen that the script is not able to rename files ( ...CD1.avi & ...CD2.avi ) inside a folder (the folder itself is renamed correctly).
Nov 22, 2011
Regarding comment 20: SVN release r83 doesn't require clean_filename to be turned on for the IMDB routine to work.
Regarding comment 21: long ago, I decided not to rename files if several files are found in a torrent. There are too many patterns (CD1/CD2, moviea/movieb, movie1/movie2, moviepart1/moviepart2, and so on). Also, what happens if the torrent contains TV episodes, subtitles (especially idx/sub)... Renaming files from a multi-file torrent would be really likely to fuck up, trust me on that ;-)
Once I'm done adding fanarts and making sure no nfo/jpg is generated for non-movie files (set, idx, sub subtitles), I'll ask you to test it thoroughly and confirm that it works fine. For now everything seems OK.
Thanks again
Nov 23, 2011
Thanks. Yes, I trust you.
Nov 23, 2011
If the destination directory already exists, the program stops itself: 'destination folder is not empty'
I think the program should continue, putting the files inside it (only if the filename is NOT the same).

Example
Suppose that you have downloaded two versions of the same film.
1) First version
It's a folder named /Avatar.2009.Xvid-MYDAD/
--> Avatar.2009.Xvid.CD1-MYDAD.avi
--> Avatar.2009.Xvid.CD2-MYDAD.avi
2) Second version
It's a folder named /Avatar.2009.Xvid-MYMUM/
--> Avatar.2009.Xvid.CD1-MYMUM.avi
--> Avatar.2009.Xvid.CD2-MYMUM.avi

After running torrentexpander on both with the 'type_1' schema, I should get:
Folder: Avatar (2009)
--> Avatar.2009.Xvid.CD1-MYDAD.avi
--> Avatar.2009.Xvid.CD1-MYDAD.nfo
--> Avatar.2009.Xvid.CD1-MYDAD.jpg
--> Avatar.2009.Xvid.CD2-MYDAD.avi
--> Avatar.2009.Xvid.CD2-MYDAD.nfo
--> Avatar.2009.Xvid.CD2-MYDAD.jpg
--> Avatar.2009.Xvid.CD1-MYMUM.avi
--> Avatar.2009.Xvid.CD1-MYMUM.nfo
--> Avatar.2009.Xvid.CD1-MYMUM.jpg
--> Avatar.2009.Xvid.CD2-MYMUM.avi
--> Avatar.2009.Xvid.CD2-MYMUM.nfo
--> Avatar.2009.Xvid.CD2-MYMUM.jpg
I think this is a good result. It's ordered.

If the destination file already exists, damn! Would it be possible to rename the folder only? If /Avatar (2009)/ exists, then the new folder could be /Avatar (2009) [1]/
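The "[1]" fallback suggested at the end of this comment could look roughly like the sketch below. The function name `unique_folder` is hypothetical, and continuing with [2], [3], ... when "[1]" is also taken is an assumption that goes slightly beyond the suggestion.

```shell
# Hypothetical helper: print $1 if nothing exists at that path yet,
# otherwise the first free name of the form "$1 [N]" with N = 1, 2, 3, ...
unique_folder() {
    candidate=$1
    n=1
    while [ -e "$candidate" ]; do
        candidate="$1 [$n]"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}
```

The script could then create `"$(unique_folder "$destination_folder/Avatar (2009)")"` instead of stopping with 'destination folder is not empty'.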
Nov 23, 2011
It is necessary to add some code to avoid the creation of empty files.
if [ "$imdb_poster" = "yes" ] && [ "$poster_url" != "" ]; then
"$wget_curl" -q "$poster_url" -O "$temp_folder_without_slash/temp_poster";
wait;
fi
I suggest using xml files instead of xml variables for debugging reasons. It would be very nice if torrentexpander had a --debug option (e.g. debug mode keeps the imdb.xml and themoviedb.xml files).
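Checking the URL before downloading prevents most empty files, but a failed download can still leave a zero-byte file behind. A possible complementary check, run after the download, is sketched here; the function name `drop_if_empty` is hypothetical, and it relies only on the standard `[ -s file ]` test (true when the file exists and is not empty).

```shell
# Hypothetical helper: delete $1 when it is missing or empty (zero bytes).
# Returns 0 when a non-empty file remains, 1 otherwise.
drop_if_empty() {
    if [ -s "$1" ]; then
        return 0
    fi
    rm -f "$1"
    return 1
}
```

Called as `drop_if_empty "$temp_folder_without_slash/temp_poster"` right after the download, it lets the script skip the poster step cleanly when nothing usable was fetched.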
Nov 25, 2011
Torrentexpander: better IMDB / TMDB plugin
+ Does not create empty files
+ Debug support (I really need it)
- Only wget at the moment
Nov 26, 2011
Thank you for this.
I made some minor changes to your code and included it in the latest SVN.
I'm sticking with variables instead of xml files, but added some more information to the debug log.
Also, I improved the rename routine so that determining the IMDB title works faster.
I couldn't get fanart to work. The XML looks like this:
1 3 true false en Adaptation. Adaptation. The Orchid Thief movie 2757 tt0268126 http://www.themoviedb.org/movie/2757 Charlie Kaufman (Cage) writes the way he lives, with great difficulty. His twin brother Donald (also Cage) lives the way he writes, with foolish abandon. Susan (Streep) writes about life, but can't live it. John's (Cooper) life is a book, waiting to be adapted. One story. Four lives. A million ways it can end. 19 8.0 R 2002-06-12 114 1228 2011-11-26 14:57:49 UTC
This is what it is supposed to look like: http://api.themoviedb.org/2.1/methods/Movie.imdbLookup
I'm
Nov 27, 2011
The line that's already in the script should work and doesn't rely on new commands like tr.
The problem is that none of the XML downloaded from tmdb contains any backdrop... while the movies in question obviously have backdrops.
The XML looks like this:
1 3 true false en Adaptation. Adaptation. The Orchid Thief movie 2757 tt0268126 http://www.themoviedb.org/movie/2757 Charlie Kaufman (Cage) writes the way he lives, with great difficulty. His twin brother Donald (also Cage) lives the way he writes, with foolish abandon. Susan (Streep) writes about life, but can't live it. John's (Cooper) life is a book, waiting to be adapted. One story. Four lives. A million ways it can end. 19 8.0 R 2002-06-12 114 1228 2011-11-26 14:57:49 UTC
On the website, there are about 10 backdrops: http://www.themoviedb.org/movie/2757-adaptation
Nov 27, 2011
This is my xml file downloaded with sample script.
Nov 27, 2011
This sample script works well for me!
Nov 27, 2011
Thank you.
Fanart now works in the latest SVN build.
TMDB servers must have fucked up yesterday evening.
Nov 27, 2011
Switched it to enhancement
Labels:
-Type-Defect Type-Enhancement
Dec 4, 2011
(No comment was entered for this change.)
Status:
Fixed
|