New In my Reading List: Readburner

posted 02:54PM Jan 15, 2008 with tags aggregation atom google rss by Lars Trieloff

Readburner aggregates the most popular items shared by users of Google Reader and offers feeds for popular items, items popular in the last week and upcoming items. So far only a small number of contributors have signed up (it is opt-in, so it will not aggregate your feed by default), but it will be interesting to see how the whole project evolves.

Fighting Wiki SPAM

posted 09:57AM Jan 07, 2008 with tags google softwaredevelopment spam transformers wiki by Lars Trieloff

Social software is software that gets spammed. This applies first and foremost to e-mail, but Wikis and blogs are also preferred targets of spammers. The following rules should act as a guideline for everyone who designs Wiki software, evaluates Wiki software or needs to configure a Wiki that is under attack by spammers.
  1. Understand the way spammers think and work: The main goal of most wiki spammers is to create link spam that will lead search engine crawlers and algorithms, especially Google's, into giving their own or their customers' websites a higher rank for certain keywords. To achieve this goal, they try to create keyword-specific links wherever possible, and this means in your Wiki. To create a large number of links in a short time, they write small programs that know how your Wiki software works and send the correct requests to create new pages or new page revisions. As in the movie "Transformers", your Wiki has become a playing field of robot wars: on one side, out to "destroy", are the spam-bots; on the other side is the Googlebot. To become more familiar with the way Wiki and blog spammers think, I recommend The Register's "Interview with a link spammer".
  2. Do not be an attractive target: The best way of preventing Wiki spam is not being a target of Wiki spam. Spammers find Wikis that are vulnerable to SPAM attacks by searching on search engines for pages that have already been spammed by somebody else. A page that is spammed and found via a Google search is vulnerable and attractive, because the spammer knows that Google will see their spam. In order not to be an attractive target, it is important to remove all existing SPAM from the Wiki and make sure that SPAM is not going to be picked up by Google and other search engines. A mechanism that has been proposed to achieve this goal (and that has been found to be effective) is using the rel="nofollow" attribute in all links that could lead to SPAM. Some wiki software applies this to all outgoing links, some only to outgoing links that do not conform to a white list of allowed pages, and some only to outgoing links on newly edited pages. The most important rule, however, is: Exclude all archived versions of wiki pages from being indexed. If your archived pages are being indexed, the spam will be picked up by the search engines, no matter how fast you are to revert the changes. A good technique to achieve this goal is using the <meta name="robots" content="noindex,nofollow"> tag in the head of all history or archive pages. To learn more about how to exclude pages from being indexed, take a look at The Web Robots Page and Google's Webmaster Central Blog on using the robots meta tag.
  3. Use your community to fight spam: What is SPAM and what is legitimate content? As good as robots might be at creating SPAM, humans beat them by orders of magnitude at detecting SPAM. As your community profits most from your Wiki, you should invite the community to join your spam fighting efforts. This means regularly observing the "Recent Changes" page, skimming through changes and change descriptions (SPAM robots seldom use change descriptions that fit the usage patterns of your wiki), and reverting spammed pages to a clean revision. By selecting Wiki software that has a "revert" or "rollback to last revision" feature, you are giving your users a powerful weapon in the fight against robots, because they can be faster at spotting the SPAM and clicking the link than most robots. If wiki spam is a major nuisance for you, you should engage in the Chongqed community, which is devoted to fighting SPAM in Wikis and retaliating against spammers (which I doubt is worth the effort). If you do not have a community that can help you fight SPAM, you should probably disable editing in the Wiki or shut it down completely. Without a community, you will lose interest sooner or later as well, but spammers will continue to find your Wiki an attractive target.
  4. Ban content, not users: Lots of spam fighting techniques involve some way of banning certain requests, based on user agent, time of day, frequency of access, IP address range, etc. Other techniques require registration or use CAPTCHAs. All these techniques have a number of disadvantages. First, they create false positives, e.g. blocking legitimate edits that just happen to use the wrong user agent, time of day or IP address range. Second, some of them, like CAPTCHAs and required registration, raise the barrier to contribution, so many users will not even try to contribute to your Wiki. Finally, they can easily be circumvented by a clever spammer; IP address based blocks in particular can be circumvented by using open proxies, dynamic IP addresses or botnets. The only thing that spammers cannot disguise is their intent to create links with specific targets and keywords in your Wiki. The most effective techniques are therefore based on banning content. This means banning URLs based on regular expression patterns (you do not have to build a database of these patterns yourself, there is an excellent one available at http://blacklist.chongqed.org/), banning text in the Wiki based on regular expression patterns, e.g. for keywords (this will be more difficult if your wiki is devoted to gambling or erectile dysfunction medication), or even banning edits based on the number of URLs posted in one editing step or the URL-to-other-content ratio of the post.
  5. Stay up to date: Staying up to date means keeping up with new versions of your Wiki software, which might not only fix bugs and add interesting new features, but also introduce new mechanisms to fight SPAM. And staying up to date means keeping up with new techniques used by spammers and ways to fight them. Good resources are the C2 Wiki (THE original Wiki) and the Chongqed Wiki.
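
To make rule 2 more concrete, here is a minimal Python sketch of the "white list" variant of rel="nofollow" and of the robots meta tag for archived pages. The host names in TRUSTED_HOSTS and the function names render_link and robots_meta are made up for illustration and do not come from any particular wiki software; a real wiki engine would do this inside its own link renderer and page templates.

```python
from html import escape
from urllib.parse import urlparse

# Hypothetical white list of hosts whose links may pass on search engine rank.
TRUSTED_HOSTS = {"wiki.example.org", "example.org"}


def render_link(href, text, trusted_hosts=TRUSTED_HOSTS):
    """Render an HTML link, adding rel="nofollow" unless the target is trusted."""
    host = (urlparse(href).hostname or "").lower()
    # Links without a host are relative and point into the wiki itself.
    trusted = host == "" or host in trusted_hosts
    rel = "" if trusted else ' rel="nofollow"'
    return '<a href="{}"{}>{}</a>'.format(escape(href, quote=True), rel, escape(text))


def robots_meta(is_archived_revision):
    """Return the tag to place in the head of history or archive pages."""
    if is_archived_revision:
        return '<meta name="robots" content="noindex,nofollow">'
    return ""
```

With these assumptions, render_link("http://spam.example/pills", "cheap pills") produces a link carrying rel="nofollow", while links to the wiki itself are left untouched.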

Similar rules apply to other kinds of social software that allow user-generated content, especially blogs and social networks, but depending on your application the motivations and techniques of the spammers might vary.
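
The content-based banning from rule 4 could be sketched roughly like this in Python. The patterns, the thresholds and the function name check_edit are placeholders chosen for illustration, not values taken from the Chongqed blacklist or any Wiki software; a real installation would load a maintained pattern list and tune the limits to its own community.

```python
import re

# Hypothetical patterns; a real setup would load a maintained blacklist instead.
URL_BLACKLIST = [re.compile(p, re.IGNORECASE)
                 for p in (r"cheap-pills\.example", r"casino[\w-]*\.example")]
KEYWORD_BLACKLIST = [re.compile(p, re.IGNORECASE)
                     for p in (r"\bviagra\b", r"\bonline poker\b")]

URL_RE = re.compile(r"https?://[^\s<>\"'\]]+", re.IGNORECASE)
MAX_NEW_URLS = 5     # assumed limit on URLs added in one editing step
MAX_URL_RATIO = 0.5  # assumed limit on the share of the page made up of new URLs


def check_edit(old_text, new_text):
    """Return the reasons to reject an edit; an empty list means accept."""
    known = set(URL_RE.findall(old_text))
    # Only judge URLs the edit adds, so existing content is not re-punished.
    added = [url for url in URL_RE.findall(new_text) if url not in known]
    reasons = []
    for url in added:
        if any(p.search(url) for p in URL_BLACKLIST):
            reasons.append("blacklisted URL: " + url)
    for pattern in KEYWORD_BLACKLIST:
        # Flag a keyword only if this edit introduced it.
        if pattern.search(new_text) and not pattern.search(old_text):
            reasons.append("blacklisted keyword: " + pattern.pattern)
    if len(added) > MAX_NEW_URLS:
        reasons.append("too many new URLs")
    url_chars = sum(len(url) for url in added)
    if new_text and url_chars / len(new_text) > MAX_URL_RATIO:
        reasons.append("URL-to-content ratio too high")
    return reasons
```

The wiki would call check_edit with the previous and the submitted revision before saving and refuse the edit if the returned list is not empty. Note that nothing here looks at the user agent, IP address or account of the editor.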

There is a hockeystick curve, after all

posted 09:45AM Nov 20, 2007 with tags google opensocial plaxo socialnetworks by Lars Trieloff

Plaxo was one of the first social networks to adopt OpenSocial, and according to a post at Mashable, this has proven a good decision: Plaxo Sees Exponential Growth as First to Use OpenSocial. The main advantage for Plaxo is that it can build a social network from a glorified address book by leveraging the ability to aggregate other social networks.

If you are on Plaxo, just add me to your Pulse, I'd be happy to see you.

The Company that kills Google will not be founded inside Google

posted 10:55AM May 25, 2007 with tags collaboration google predictions virtualorganizations by Lars Trieloff

Robert Cringely writes about The Final Days of Google. He sets up the following equation:
"Gather a bunch of smart people, they will create new ideas, some good, some very good, some better than your original idea, and you cannot recognize or pursue all of them, so they will be pursued by other people, who will kill you." If this holds, then the threat to Google is not gathering smart people, because Google can still leverage a fraction of the ideas its employees are generating.

The problem is that with modern distributed collaboration technology it is much easier to gather a bunch of people, even smart people, without being a real company. There are more ideas generated outside Google than inside. When people outside Google can collaborate as productively as people inside by forming a virtual organization, they, together with the right idea, have the potential to kill the cash cow.

Downloaded, tested, works: Spanning Sync

posted 03:52PM Feb 07, 2007 with tags calendar google macosx productivity by Lars Trieloff

Via Lifehacker: Spanning Sync is a small Mac OS X application that allows you to synchronize your iCal calendars with Google Calendar bidirectionally. Before Spanning Sync, Google Calendar offered iCalendar export, which could be subscribed to in iCal, and iCal offered iCalendar publishing to an FTP or WebDAV folder, which in turn could be subscribed to from Google. However you turned it, there was no bidirectional synchronization until now.

The beta version of Spanning Sync, available as of today, works for me, but if you try it, you should expect to pay for this useful service in the future.

Google is not your competition

posted 11:04AM Jan 04, 2007 with tags business google web by Lars Trieloff

"Google is not your competition, Google is the environment". Rich Skrenta writes in Winner-Take-All: Google and the Third Age of Computing how Google has become the start page for the internet and the favourite in online advertising, and how it can use this market power to push into other verticals.

According to Rich, web companies should not try to compete with Google, but should try to align with Google and see themselves as Google companies that use Google search and Google AdWords to drive customers, Google Checkout for payment, Google Account Authentication for single sign-on, and so on. At the end of this process Google will own the internet and you will have to live with it.

The secret of optimizing your JRoller weblog for Google, del.icio.us and Firefox

posted 01:24PM Aug 01, 2006 with tags blogs delicious firefox google roller seo tips by Lars Trieloff

Roller is great weblog software. This is the reason why JRoller and Goshaky Weblogs use it, and this is the reason why so many great bloggers are on JRoller. But the standard Roller templates get one thing wrong: They fail to set the correct title for individual weblog permalink pages.

Take for example the Bile Blog, which is one of the most popular blogs on JRoller. The title of the start page is "The Bile Blog", but if you turn to an individual entry's permalink page, e.g. that of "Another googleturd", you will see that the title of this page is again "The Bile Blog". Why is this bad? The top-5 reasons are:

  1. Google cannot see the difference. The title of the page is important for Google's rankings, and without a proper title, Google will not find out it is being bashed in "Another googleturd", which means fewer visitors for the Bile Blog.
  2. Firefox cannot see the difference. Imagine you are opening three stories of the Bile Blog in tabs in your Firefox web browser. The title of all three tabs will be "The Bile Blog" and you have no chance to see the difference, e.g. if you would like to show your colleagues the latest Google-bashing.
  3. del.icio.us bookmarks cannot see the difference: Many people are using the del.icio.us bookmarklet to manage their bookmarks. After clicking the bookmarklet and tagging the entry, they do not review the title, so their bookmark will be entitled "The Bile Blog", even if they are not bookmarking the whole blog, but a particular story. Run a search at del.icio.us for "The Bile Blog" and try to find out whether a link is pointing to the start page or an entry.
  4. It is not accessible. Most people do not care about making their site accessible, but most people with disabilities are actually using the internet. Setting the correct title helps them when visiting your weblog.
  5. You look like someone who is not able to customize the templates of his weblog system correctly, but with these instructions, it is no problem for anybody.

All you have to do is log in to your JRoller weblog. Click on Preferences, click on Theme, click on Customize (if you are not already using a customized theme), click on Templates and edit the Weblog or _decorator template. You need to find the text between <title> and </title> and paste the following code there:

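## Prints the title of the first entry, followed by "and more" if there is a second one.
#macro( showEntryTitle $entries)
  #foreach( $entry in $entries )
    #if ( $velocityCount == 1)
    $entry.title
    #end
    #if ( $velocityCount == 2)
    and more
    #end
  #end
#end

## Permalink page: weblog title, a colon, then the entry title. Any other page: weblog title only.
#if ($pageModel.weblogEntry)
  #set($entries = [$pageModel.weblogEntry])
  #showWebsiteTitle(): 
  #showEntryTitle( $entries )
#else
  #showWebsiteTitle()
#end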

This will show the title of your current post on permalink pages and leave the start page unchanged. And, most importantly, it will make Google, del.icio.us and Firefox users happy.