MySpace  
An overview of the MySpace user page source code.
Updated Feb 4, 2010 by spiderb...@gmail.com

Introduction

This page describes the XHTML structure and contents of a typical MySpace user page, with notes regarding relevant and usable content for scraping and subsequent analysis.

Note: Around July 2008, MySpace began outputting XHTML instead of HTML. As a result, I had to update the working versions installed on my system. The only change necessary was in the pattern matching: the termination string had to be changed to match the new XHTML output. After that small change, everything worked fine again.
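As a rough illustration of the termination-string matching mentioned in the note, here is a minimal Python sketch. The terminators (`<br>` and `<br />`), the "Last Login:" label, and the function name `extract_after` are all assumptions made for this example; they are not taken from the actual scraper.

```python
import re

# Hypothetical terminators: an HTML-style and an XHTML-style line break.
# The real termination strings used by the scraper are not shown on this page.
TERMINATOR = re.compile(r"<br\s*/?>", re.IGNORECASE)


def extract_after(page, label):
    """Return the text between `label` and the next terminator, or None."""
    start = page.find(label)
    if start == -1:
        return None  # label not present on the page
    start += len(label)
    match = TERMINATOR.search(page, start)
    if match is None:
        return None  # no terminator after the label
    return page[start:match.start()].strip()


# Made-up fragments in the old (HTML) and new (XHTML) styles.
old_html = "Last Login: 1/2/2008<br>"
new_xhtml = "Last Login: 7/15/2008<br />"
print(extract_after(old_html, "Last Login:"))
print(extract_after(new_xhtml, "Last Login:"))
```

Accepting both terminator forms in one pattern means a switch between HTML and XHTML output would not require another update.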

What is scrape-able on a user page without logging in?

Assuming a user's profile is public (see the next section for a comparison of public and private profiles), the following information can be gathered from a user's MySpace page without your application needing to log in:

  • Last login
  • Page Title
  • Mood
  • Profile Pic
  • (I will finish this list later.)
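A minimal sketch of how the fields listed above might be collected from a downloaded page, written in Python. Every pattern, label, and the `profile-pic` id below is a placeholder assumption, since the real MySpace markup is not reproduced on this page; `scrape_fields` and `FIELD_PATTERNS` are likewise hypothetical names.

```python
import re

# Hypothetical patterns: the labels, tags, and the "profile-pic" id are
# placeholders, not the real MySpace markup.
FIELD_PATTERNS = {
    "last_login": re.compile(r"Last Login:\s*(.*?)\s*<br\s*/?>", re.I | re.S),
    "page_title": re.compile(r"<title>\s*(.*?)\s*</title>", re.I | re.S),
    "mood": re.compile(r"Mood:\s*(.*?)\s*<br\s*/?>", re.I | re.S),
    # Assumes the id attribute appears before src inside the img tag.
    "profile_pic": re.compile(
        r'<img[^>]*\bid="profile-pic"[^>]*\bsrc="([^"]*)"', re.I
    ),
}


def scrape_fields(page):
    """Map each field name to its first match in `page`, or None if absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(page)
        result[name] = match.group(1) if match else None
    return result


# Made-up sample page used only to exercise the patterns.
sample = (
    "<html><head><title>Example Page</title></head><body>"
    '<img id="profile-pic" src="http://example.com/pic.jpg" />'
    "Mood: happy<br />"
    "Last Login: 7/15/2008<br />"
    "</body></html>"
)
print(scrape_fields(sample))
```

Returning None for a missing field, rather than raising an error, lets the same function run against pages where some of this information is not present.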

Private vs. Public User Profiles

This section covers the differences between scraping a public user profile and a private one. The most informative thing to put here is probably a list of what can't be scraped when a user's profile is set to private:

  • Anything relating to the user's friends
  • (I will finish this list later.)

What is unavailable to applications that aren't logged in?

The two sections above show the difference between the content available in public and private user profiles. The following content is unavailable to applications that aren't logged in:

  • All images in a user's photos, except the thumbnail of the image currently set to their profile image.
  • (I will finish this list later.)
