How I Added Tagging Using PHP and MySQL
On FONTSMACK, I just enabled the ability to view fonts by tag. The tags, I suppose, are more like keywords as there is currently no user intervention to tag fonts with their own keywords. In any case, my implementation is fairly straightforward and I hope this will help others looking at implementing something similar.
The Database
The database for FONTSMACK is straightforward. All fonts are stored in a fonts table. One of those fields in that table stores the tags in a varchar
field. It's a space-delimited list of words. "sans-serif futuristic computer" is a prime example. When displaying it with the font details, I just pull it out of the database and display it on the page as-is.
Outputting the list of tags
Thanks to PHP and its array functions, outputting a unique list of keywords is quite straightforward:
$types = array();
$rs = $db->select("select tags from fonts");
if($db->numRows()>0)
{
while($ftypes = mysql_fetch_array($rs))
{
$types = array_merge($types, explode(" ", $ftypes[0]));
}
$types = array_unique_compact($types);
for($i=0;$i<count($types);$i++)
{
echo '<a href="/fonts?t='.$types[$i].'">'. $types[$i].'</a>';
if($i<count($types)-1) echo ",\n";
}
}
The while
loops through all rows in the table and explodes the tags into an array which gets appended onto a master types array. Once that is done, I strip out all duplicate values and compact the array using a custom PHP function called array_unique_compact
.
function array_unique_compact($a){
$tmparr = array_unique($a);
$i=0;
foreach ($tmparr as $v) {
$newarr[$i] = $v;
$i++;
}
return $newarr;
}
With my compacted and unique array, I loop through each tag in the array and output it to the page.
Searching
To pull a list of items out of the database with a specific tag is now pretty simple. Your WHERE
clause in your SQL should like something like this, where $tag
is the keyword that we are looking for:
..."tags LIKE '% $tag %' OR tags LIKE '$tag %' OR tags LIKE '% $tag' OR tags = '$tag'"...
In SQL, % is a wildcard character. What the first LIKE statement says is to find the keyword with a space on both sides anywhere in the string. The second says to find the keyword at the beginning of the field. Similarly, the third LIKE statement tries to find the keyword at the end of the string. Finally, we check if the tag is exacly equal to what is in the database.
Notes
What I glossed over in this example is data validation but I can't stress it enough: validate your data. It's easy to get lazy and not check the values for things like quotes that shouldn't be there (Yup, I can be lazy, too).
UPDATE: I've added a post on weighting the tag list.
Conversation
I use about the same idea but to make a cleaner SQL call I decided to add spaces (or, in my case, semi-colons) at the start and end of the tag string. Thus, the SQL is "LIKE %;$tag;%".
I therefore only need to strip the first and last element of the array when exploding... In my opinion php is better suited than SQL for this small overhead, maybe I'm wrong.
Hope this helps.
You can simplify the where clause a little:
WHERE concat(' ', tags, ' ') LIKE ' %$tag% '
A minor word of warning: any like clause that starts with a wildcard (LIKE '%...') cannot use indexes.
So if you have 10,000 rows in your table, the database will have to compare ALL 10,000 rows.
Would it be better for your tagging schema to use a many-to-many schema for the tags?
You might want to read this pretty thorough look at various tagging schema.
Great job! Maybe tell us the advantages of this? I'm working with Ruby on Rails now, but tagging seems to be my next project... after my webcalendar and other mini projects...
Richard@Home: I had read that article when it originally came out so I didn't see the update regarding indexing issues using LIKE. Since I'm only dealing with fonts, it would be a long time until I hit 10,000 records.
All things considered, I think I'll use Korbo's suggestion (something I had thought of but disregarded as unnecessary) to minimize the number of LIKE's that are being done. Being able to use spaces is certainly an advantage because I can use the PHP function
trim
to remove the whitespace on either side before exploding it into an array.Jeremy: I would normally recommend using a properly normalized database as it is more flexible in the long term. I fully intended this to be a straightforward solution and didn't need any major scalability. Simplicity was key and being able to store the tags in a single field was definitely easier.
Can you describe the process just a little more please?
Also, it would be cool to see how've you done the weighted tags list?
I'm not sure about the pros and cons of each, but I prefer to use two tables.
1. The fonts table.
2. The tags table
Inside 2, I have 3 columns. First is the record id, then the font id, then the tag. I create a separate row for each tag.
This would add complexity, but you can make your queries faster by using three tables. Plus, it would be easier to manage and get better search results.
1) fonts
2) tags
3) a font-tag transition table.
tags table has only two columns id and name
font-tag table has two columns
font_id
tag_id
That's basically what I said, but more complex.
Marcello: while performance would be faster in doing a search, I'm not sure it results in better search results (unless MySQL would drop matched results). Also, considering that the site will unlikely hit more than a few thousand records, performance seems less of a factor.
Down the road, if I need to go that route, it won't be too hard to create a script that would normalize the databse for me.
Hello Jonathan,
I managed to do almost everything, writing and reading from database, outputting tags, weighted list, but I can't do simpliest thing: *pull a list of items with a specific tag*.
Can you describe that? I did a separate file "tag.php", so, what php code I must enter in it?
?ime: Start with the basic SQL to pull a list of items from the database:
SELECT fieldname FROM tablename
then tack on the LIKE statements from the article as part of the WHERE clause.
SELECT fieldname FROM tablename WHERE tags LIKE '% $tag %' OR tags LIKE '$tag %' OR tags LIKE '% $tag' OR tags = '$tag'
hey, thanks for the cool code. It got me by for the time being until I put all of my tags into their own table.
I had a problem with null members in the array, and at the same time decided to limit the array members to more than 2 characters, here's the small but of code I've struggled over!
...
$tags = array_merge($tags, explode(" ", $ftags[0]));
}
//remove empties
foreach($tags as $key => $value) {
if($value == "" || strlen($value) < 3) {
unset($tags[$key]);
}
}
Thanks! This helped in a pinch. Also, I love your design and the innovative comment placement!
Does anyone bother filtering/ translating/condensing tags? For example, Hyperactive and Hyperactivity might be condensed into one tag. Or, if someone searched on "Run" the results wouldn't include "Running" (when the where clause includes spaces around the tag).
Amy: Data quality is certainly a valid concern and the answer is usually, "No". Most systems that let users do the tagging don't consolidate into known groupings. To avoid these kinds of things, sites like del.icio.us try to offer suggestions so as to steer the user into providing relevant data under the same tags as other sites.
Very good
good site...can n e 1 send me post a comment code?
i've used your tags-script for my brands selectable by industry.
cool, thanks again! :-)
http://www.brandinfection.com/brands/
how do you like it ?
best regards, nader
hey Jonathan,
nice work on the tagging script, i've gotten almost everything to work properly except pulling a list of specific tags
i have a seperate file "display-tags.php" and my database has a table with columns id, title, tags; and tags are comma seperated
this is a link to the display-tags.php source: http://asifr.com/display-tags.txt
hope you can help out, thanks
hey, thanks for the cool code. It got me by for the time being until I put all of my tags into their own table.
I had a problem with null members in the array, and at the same time decided to limit the array members to more than 2 characters, here's the small but of code I've struggled over!
...
$tags = array_merge($tags, explode(" ", $ftags[0]));
}
//remove empties
foreach($tags as $key => $value) {
if($value == "" || strlen($value) unset($tags[$key]);
}
}
pl. send me the solutions for using wildcard char in php
best regards
mirza
It's similar to what I did. But I used ", " to seperate Tags. Thus I'm able to use words containig spaces as tags.
Really cool tutorial. Thx very much. Will use it in the next version of my site!
It sounds like you're working with a small dataset where the difference would be negligible, but for the sake of others working with larger projects: in your array_unique_compact() function, would it be more efficient to use array_values() instead of re-keying the entire array with a foreach statement?
Danny: Yes, actually, array_values() would have been a much better option, even for my small example, since they're doing exactly the same thing (re-keying the array).
Hello,
Will this code work efficiently with large databases? I have a community based web site and I would like to create a tagging system. The problem is that my database will most likely have more than 10,000 records over a short period of time, so I'm concerned if the above code will be able to handle large databases.
Hopefully someone will answer, I know it's an old post.
@s.: it might not be practical but it does depend entirely on how you plan to use the data. It worked for me on this project as I was dealing with a smaller dataset. If you need to do some aggregation, it'd likely be best to store it in its own table.
Thanks for the tutorial... I've been wanting to implement tagging on my site for ages, but have been unable to decide on the best method. Your method is one that I hadn't thought of yet and seems to be the best so far... thanks!
Very interesting solution... and do you know relation tables?
ah non c bon