My favorites | Sign in
Project Home Downloads Wiki Issues Source
Search
for
DatabaseStructure  
Proposed DB schema for YOURLS 1.6
Phase-Design
Updated Aug 25, 2010 by ozh...@gmail.com

DB in YOURLS 1.5 and prior

It's weak, performance and disk space wise, and does not allow to easily add new features. Need to improve.

Proposed DB schema for YOURLS 1.6

(See RoadMap)

Legend:

  • PK primary key
  • U unique
  • K index

Feedback?

Please comment below :)

Comment by robvoll...@gmail.com, Mar 30, 2010

> Will change this only if I find a very simple way to convert timestamps into datetime

What about this? http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_from-unixtime

Comment by project member ozh...@gmail.com, Aug 25, 2010

New DB schema. Deleted old comments that were applicable to previous schema (thanks a lot for feedback, peeps)

Comment by asha...@gmail.com, Sep 6, 2010

Hi,

1. The SHORTURL_META and TAG tables have a very strange relationship. How to extract tag values from it or how to extract short urls by tag?

I am not sure if different meta types are needed at all. If needed then it has to be in different table, e.g. META_TYPES.

I would suggest to add a field 'note' to the SHORTURL table. Rename and reorganize SHORTURL_META to SHORTURL_TAGS:

id shorturl_id tag_id

2. What is the purpose of the 'user_id' field in the SHORTURL table?

3. What if the referrer_id is unknown? Not sure... Add IP to the LOG table.

4. Add 'blacklisted' (yes/no) field to the URL table.

Thank you.

Comment by project member ozh...@gmail.com, Sep 6, 2010

Ashapka:

Re 1) The "note" or any meta data attached to a shorturl will be stored in SHORTURL_META. I don't think for example that notes will used by everybody, so it doesn't deserve a column on its own. Different meta type are definitely needed, they will allow very extensible plugin. For instance you will save a short url and state that it's reserved to only one user agent, or state that it should not redirect longer than 1 month. "tag_id" and "note" will be standard meta values but anything will be storable there.

Re 2) maybe user management, see RoadMap

Re 3)I don't see the value of storing individual IP?

Re 4) No, but that is for instance another perfect meta value

Comment by asha...@gmail.com, Sep 7, 2010

Hi Ozh,

Thanks for the response.

Re 2) OK

Re 3) For example, it can be used as an anchor for the future "DeepUserStats?" plugin. :)

Re 1) & 4) Disagree.

The 'note' column belongs to the SHORTURL table even if only 1% of users are going to use it. NULL value does not take any space or slows the system.

The 'blacklisted' column belongs to the URL table, not the SHORTURL table because we blacklist a 'longurl', not a short version (or even versions) of it. In case of implementation, this field will be used in 100% calls.

You don't need meta types for new plugins. New plugins will use new tables or new columns. It's simple to install (attach) or uninstall. Don't have to worry about hardcoded meta type values.

For example:

Current state with tags without SHORTURL_META table:

SHORTURL_TAGS table:

id shorturl_id tag_id

New plugin "Categories":

Two new tables: (SHORTURL_CATEGORIES or URL_CATEGORIES) and CATEGORIES.

SHORTURL_CATEGORIES structure:

id shorturl_id category_id

CATEGORIES structure:

category_id category

New plugin "Blacklist":

Add new column 'blacklisted' to the URL table.

My point is: Current SHORTURL_META approach will slow the system and complicate code significantly. It will be hard to attach new plugins because 'meta_type' values have to be hardcoded in the database and the code. It will be hard to maintain or read code after a few plugins. A database should not be designed this way.

Regards, Alex.

Comment by brett.pr...@gmail.com, Sep 24, 2010

A few comments (and a question):

  1. This is a much improved design over the current two tables.
  2. I agree with Alex that logging referrer IPs would be useful. You're logging the country code, why not log the IP and maybe even hostname? Storage is cheap and it would open up interesting site-wide stat possibilities. (This referrer passes clicks most often on Thursdays.)
  3. Metadata that are called often can always been pulled out into columns if it affects performance. I don't think there's a reason to start with a long shorturl table of "they might use this" columns without any profiling data. Premature optimization is bad.
  4. I strongly disagree with Alex that expecting plugins to add their own tables or (even worse) add columns to core tables will help with readability either for the code or database.
    1. There's no reason meta_type needs to be hardcoded in the database or in core code. Plugins adding meta_types should be expected to keep track of them.
    2. A join isn't difficult to read in code. I'd expect helper APIs available for plugins to pull shorturls by arbitrary meta_type and optionally values, so the code is clean for plugins regardless of db schema.
    3. Allowing plugins to add columns could be catastrophic for upgrades that require changing the database structure.
  5. Why are url.longurl and shorturl.shorturl BINARY?
Comment by project member ozh...@gmail.com, Sep 25, 2010

Re: longurl & shorturl as BINARY: to allow case sensitive search. But maybe there's a better way, I'm mostly an SQL noob.

Re: logging: what I have currently in mind (and which is not decided yet by any means) is something like this: - log stuff for 24 hours, in LOG: shorturl, hit time, referrer, country - then, every 24 hours, rotate this in archive tables, to avoid having gigantic tables that don't scale (several issues about this currently) to keep only a daily recap of hits, and a total recap of countries and referrers.

I think 99.9% of people don't need to log every single IP, so that would be plugin territory instead of core.

But, again, this area is definitely not certain. Some tables are certain (meta for instance) but I'm quite shaky regarding the whole stat area.

Comment by peter.sn...@gmail.com, Nov 13, 2010

Hi just two remarks: 1 Great work! Looks great (except for one thing, see 2) 2 Please also include the plugin that allows period (.) in the URL (I copied the hyphen plugin to get my URL structure working) I use the .flckr and .blg and .boo suffixes to show what kind of url is to be expected.

Comment by michaelr...@gmail.com, Mar 15, 2011

Quick question: Why in the revised schema is there a click_count field in shorturl? This seems counter-intuitive, as it prevents the DB from being fully normalised. Would it not be better to have this value worked out from a search of the log table (i.e. "SELECT count(shorturl_id) FROM log WHERE shorturl_id = $id GROUP BY shorturl_id;")? I ask because I have already implemented something like this in my current installation of 1.5, as I have found that I have at times wanted to directly edit the log table, and having a separate field to update somewhere else has caused problems.

Comment by goo...@nickb.users.panix.com, Apr 29, 2011

Ozh a few questions and thoughts (and I'm even more of an SQL noob than you, so take this with a grain or two of salt..)

I'm presuming the SHORTURL_META table would allow folks like me to push in a data point signifying "tweeted at date/time" to allow folks to do analysis on when the best time to tweet a link is, or how soon after tweeting they get the traffic. (although how to make an AJAX call back from a plugin is one of those things I haven't figured out yet...)

Also, I'd like to keep the IP address in my database for long term, but I'm also really low on click volume. Perhaps implementing this as a configurable option on when to roll the data up into the summary table? (e.g. Roll up after 24 hours, 1 week, 1 month, 1 year etc..) This'd allow people to pick a period that works for them. Although the argument against this is that this info is sometimes stored in their web server log. (although not in mine.. :-/)

The other argument against doing this is that

Comment by mybsi...@gmail.com, Jan 8, 2012

Yes, please add referrer IP address. Would be great for advanced stats.

Comment by cachepla...@gmail.com, Today (8 hours ago)

Another no00b here. For the first time, I have just installed it - love your work!

I too would like to vote for 2 extra columns in the database, namely The SHORTURL_META and TAG tables

When creating a short URL, I'd like to write some short notes lie "blog ad on front page" or "link in email" or "link on page 2 of blog" or "facebook link" etc

I'd also like to be able to create tags for the links, like Evernote and Delicious do, so I can tag links like "email link" or twitter only links" or "software" etc and have a searchable tag field

... and yes, they deserve a column of their own

As an aside, as I am new to this, how do I do a full database backup - CRON job, or something else thru cPanel? Help would be appreciated here!

Thanks!


Sign in to add a comment
Powered by Google Project Hosting