Fog Creek Software
Discussion Board





anti-linkrot suggestion (CityDesk)

This is the second time I have raised this issue.  (The first can be found here:  http://discuss.fogcreek.com/CityDesk/default.asp?cmd=show&ixPost=12909)  Bear with me on this...

It still concerns me greatly, so I am raising it again.  Hear me out:

My plan is to have an evolving website.  Evolving in terms of categorization, that is.  I wish my site did not have to evolve, but due to the limitations of how many articles my partners and I can publish, evolution is an unfortunate necessity for us.

For example, at first there may only be one article relating to computer graphics.  However, after a few are written, I will have to create a specific "computer graphics" category. Eventually, "computer graphics" will have to be subdivided into "programming" and "modeling".  "Programming" could conceivably be subdivided into "real-time" and "pre-rendered".  Etc.  These subdivisions could get fairly deep and are impossible to predict as they depend on the content which evolves.

In CityDesk, an evolving website involves moving articles around, which changes and *kills* the old URLs.  This is known as "linkrot" and it is one of the Internet's biggest problems:
http://www.useit.com/alertbox/980614.html

I want to avoid linkrot.  But, how?!

To be blunt, I do not want to hear about manual solutions.  I want my publishing software to handle what it is supposed to handle (website, code, links, etc.) and I want to handle writing the articles and categorizing them as needed.

MY SUGGESTION:  CityDesk automatically creates forwarding links.

MY SUGGESTION #2:  CityDesk does not change URLs of moved articles.

Please understand that my website categorization *has* to evolve.  It is not a case of poor planning.  The plan *is* to have an evolving website that will categorize itself as necessary.  I do not want a website that tries (and fails) to be "big" right off the bat.  You know those sites.  They have every category imaginable with half of them void of content.

Needless to say, a solution to the linkrot problem is very important to me and I appreciate all comments.

Matthew Doucette
Wednesday, December 08, 2004

Re: In CityDesk, an evolving website involves moving the articles around

Couldn't you just leave the articles where they are, give them a keyword corresponding to the divisions and subdivisions that 'evolve' and then filter on these keywords? No moving around, no 404 error there!

Ruud van Soest
Wednesday, December 08, 2004

No, I cannot give them a keyword corresponding to the subdivisions.  This would entail knowing what the future structure of the site is.  If I could predict this future structure, then there is no problem!  How?  I would categorize everything perfectly right off the bat (but just not have my website display the full categorization until necessary.)

Matthew Doucette
Wednesday, December 08, 2004

The more I read...

http://www.useit.com/alertbox/980614.html
http://www.w3.org/Provider/Style/URI
http://www.useit.com/alertbox/990321.html

...the more I agree that the URL of an article should never, ever, change.  (To recap, this was my suggestion #2.)

My #2 suggestion is better than my #1 (CityDesk creating forwarding URLs.)  From the user's point of view, a forwarding URL is definitely worse.  Sometimes it interferes with the back-button, which is a definite no-no.  From a webmaster's point of view, who wants to link to a forwarding URL?  What webmaster wants to update their links?

URLs should never die.

Matthew Doucette
Wednesday, December 08, 2004

HTML DbScript can help. It means using a database that you customize and putting the content there. Then you can use lookup tables to add your categorizations as you need them.

Well, that's one idea for you!

You should consider an open source solution

Bob Bloom
Wednesday, December 08, 2004

Your "HTML DbScript" sounds like it is what I need.  But, it also sounds like CityDesk.  I do not see the difference.  (I may be blind!)  Perhaps you could elaborate for me, please?

Matthew Doucette
Wednesday, December 08, 2004

Can you use keywords to expand your categories? An article could stay where you created it while its keywords direct it into article index pages and menus.

That is working very well for me on a couple of sites.  These sites are absolutely unpredictable. In this one Mac said, "We're getting a lot of e-mail about the 'Voltaire'. Can you list them all on one page?" Keywords to the rescue.

http://ahoy.tk-jk.net/macslog/TheVoltairePages.html

tk
Wednesday, December 08, 2004

Keywords... perhaps.  Let me think...

If the category and 'location' of an article depended only on its keywords, then I could modify the keywords to move an article as my website evolves.

The downside is this is still a manual fix.  CityDesk already offers a full directory structure that I can *see* and understand.  I can drag and drop articles with ease.  It is very intuitive.  Modifying the hidden keyword field is not.

...

CityDesk is perfect except for its linkrotting ways.  I come back to this, but I cannot ignore it.  It is perfect for moving and organizing your articles after you have written them... except the moment you do, it KILLS the URL.  It is just wrong.

Matthew Doucette
Wednesday, December 08, 2004

You could move the document to its new directory then paste a copy of it back to where it came from.

tk
Wednesday, December 08, 2004

It's a problem. I'd rather not have CityDesk automatically manage redirects etc for me, as I don't always want to preserve URLs (e.g. if the site is new and has no inbound links anyway). Maybe whenever you move an article to a different folder, CityDesk could ask if you want to create a redirect?

In the meantime, I just manually create my own redirect articles as explained here (kind of):
http://citydesk.pool-room.com/Migrating.html

It really only takes a few seconds per article, and it's not like it's something you do often. When rearranging your site structure, you'd be giving it some careful thought anyway, so a little extra time taken to preserve URLs won't hurt.

Darren Collins
Wednesday, December 08, 2004
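
A redirect article of the kind Darren describes can be sketched roughly as follows. This is only a minimal illustration, assuming the redirect is a plain HTML page using a meta refresh that is left behind at the old location; the function name `redirect_stub` and the example path are hypothetical, and the linked article may do it differently.

```python
from html import escape


def redirect_stub(new_url: str) -> str:
    """Return a small HTML page that forwards visitors to new_url.

    Hypothetical helper: the page would be published at the OLD URL
    so that existing inbound links keep working.
    """
    safe = escape(new_url, quote=True)  # make the URL safe inside attributes
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<meta http-equiv="refresh" content="0; url={safe}">\n'
        "<title>Page moved</title>\n"
        "</head><body>\n"
        # Plain link as a fallback for browsers that ignore the refresh.
        f'<p>This page has moved to <a href="{safe}">{safe}</a>.</p>\n'
        "</body></html>\n"
    )


if __name__ == "__main__":
    # Example path is made up for illustration.
    print(redirect_stub("/graphics/programming/article.html"))
```

As noted earlier in the thread, a client-side forward like this can interfere with the back button, which is one reason a server-side redirect is usually preferred where it is available.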

Matthew,

Re: No, I cannot give them a keyword corresponding to the subdivisions.  This would entail knowing what the future structure of the site is.

That is not the case; you can incrementally refine your keyword structure over time.

Re: The downside is this is still a manual fix.  CityDesk already offers a full directory structure that I can *see* and understand.  I can drag and drop articles with ease.  It is very intuitive.  Modifying the hidden keyword field is not.

What is unintuitive about typing the category or subcategory that an item belongs to instead of dragging it to another folder?

What do you gain by dragging it to another folder and by doing so creating the problem that you subsequently have to solve?

Keywords are meant to be used for these purposes and they work fine in this kind of hierarchical category system.

Ruud van Soest
Thursday, December 09, 2004

In addition, what do you do when an item belongs to more than one category? You'll have to duplicate, triplicate, or quadruplicate the item, potentially introducing synchronisation problems.

You might also consider an online database and make it searchable via an ASP or PHP form. One of the limitations of a desktop CMS is that you have to define the search options in advance. The results of a query are generated offline and then sent to the webserver. This gives a practical limitation, because if you have ten categories you don't want to make search pages for all combinations between them - guitar music AND composition AND Caetano Veloso AND etc. With an on-line database visitors can define their searches themselves on the fly. So, as far as I see, a desktop CMS puts limitations on the complexity of searching.

Ruud van Soest
Thursday, December 09, 2004
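
To put a number on the "all combinations" point above: with ten categories, pre-generating one page for every possible AND query means one page per non-empty subset of the categories. The sketch below just counts them; the category names are placeholders, and the function name `and_query_pages` is made up for illustration.

```python
from itertools import combinations


def and_query_pages(categories):
    """List every non-empty combination of categories.

    Each combination stands for one pre-generated search page
    (category A AND category B AND ...).
    """
    return [
        combo
        for size in range(1, len(categories) + 1)
        for combo in combinations(categories, size)
    ]


if __name__ == "__main__":
    # Placeholder names; any ten distinct categories give the same count.
    categories = [f"category-{n}" for n in range(1, 11)]
    print(len(and_query_pages(categories)))  # 2**10 - 1 = 1023
```

An online database avoids generating those pages at all, because each query is answered when the visitor asks it.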

Regarding making article copies and/or multi-category articles:

Articles should only exist in one place on the Internet (for optimal popularity) *and* they should only exist as one copy to be edited (to avoid synchronization issues).  CityDesk would have to show instances of articles found in multiple categories.

Perhaps the keyword solution is the best.

Even if it is, it still does not change the fundamental flaw of CityDesk's linkrotting.

Matthew Doucette
Thursday, December 09, 2004

"I'd rather not have CityDesk automatically manage redirects etc for me"

By default, CityDesk should prevent whichever is more harmful:  1) Killing a link to an active article or 2) leaving a link to an unwanted article.  #1 is more harmful.  Therefore, the default settings should prevent it.  #2 is solved by giving an option to the user (as Darren posted).

The best solution is to avoid redirects altogether.

CityDesk should never change the URL.  No redirect management necessary.  When you delete an article, CityDesk should give the option of deleting the URL or leaving a message:

“Sometimes Web content becomes truly obsolete. An example would be the advance program and registration form for a conference that has already taken place. In such cases, it makes sense to remove the original page. Even so, the URL should still be kept alive and should be redirected to point to either a follow-up message”
- http://www.useit.com/alertbox/980614.html

However, sometimes you *do* want to delete a URL, such as in testing phases.

Matthew Doucette
Thursday, December 09, 2004

This is odd (and I do not mean to pick on you Darren)…

"In the meantime, I just manually create my own redirect articles as explained here (kind of):
http://citydesk.pool-room.com/Migrating.html"

I searched "linkrot" in this forum and found two results; both threads link to an older version of your article:
http://www.pool-room.com/CodeCraft/CityDesk/Migrating.html

As you can see, it has suffered from linkrot, and I was unable to find the article.  This demonstrates my point exactly.  I wanted to read an article that existed, but could not.  This is the problem.  If CityDesk never changed the URLs, this problem would never occur.

Again, no offense was intended towards Darren.  Linkrot is everywhere and every website has suffered from it.

Darren, is the result of the (outdated) link above an example of the solution posted in the same article?

Matthew Doucette
Thursday, December 09, 2004

Regarding keywords being unintuitive.

Structuring using keywords is unintuitive because you cannot easily see the structure.  Also, the structure visible in CityDesk would be *completely different* from the structure visible on the website.  They should match.  (Right?)

Matthew Doucette
Thursday, December 09, 2004

If I were you, I'd structure the content by date and put any categorisation on top of that using keywords.

That's how I developed my weblog (http://weblog.janek.org) and it works really well.

I understand your desire to avoid link rot. I care about that too. The above structure, together with a .htaccess file, helped me to prevent it.

Janek Schwarz
Thursday, December 09, 2004
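
Janek does not show his .htaccess file, so the following is only a guess at the general shape: a minimal sketch, assuming an Apache server with mod_alias, that turns a table of moved articles into `Redirect 301` lines. The function name `htaccess_redirects`, the paths, and the `www.example.com` domain are all hypothetical.

```python
def htaccess_redirects(moves: dict[str, str], base_url: str) -> str:
    """Build .htaccess text with one permanent redirect per moved article.

    moves maps an old URL path to the new URL path; base_url is the
    site root (hypothetical here), prepended to each new path.
    """
    root = base_url.rstrip("/")
    lines = [
        f"Redirect 301 {old_path} {root}{new_path}"
        for old_path, new_path in sorted(moves.items())
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up example of one article moved into a new subcategory.
    moves = {"/graphics/shading.html": "/graphics/programming/shading.html"}
    print(htaccess_redirects(moves, "http://www.example.com/"), end="")
```

Keeping the table of moves in one place means the old URLs keep answering even after several rounds of reorganisation, as long as each move is recorded.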

"I'd structure the content by date and put any categorisation on top of that using keywords."

I agree.  Until CityDesk solves its linkrotting problems, this is the best solution.  (Thanks to all of you who suggested using keywords.)

...

To elaborate, this is what I was thinking:

I looked towards the man who knows web design best, Jakob Nielsen.  If you look at all of his articles (http://useit.com/alertbox/) they all exist in the *same* directory (which eliminates linkrot forever) and they are named after the date on which they were written.

Then, using keywords, I can structure the site properly.

My last problem was how to have the articles show up in CityDesk.  Janek Schwarz just solved that for me.  Just organize it by dates!

Thanks.



I still believe CityDesk has two flaws.  Please comment on these:

1) Moving and re-organizing articles is so easy but creates linkrot.  Linkrot should never happen.

2) To avoid linkrot, CityDesk’s visible directory structure has to be based on something other than the structure of the website.  Isn't this wrong?

Matthew Doucette
Thursday, December 09, 2004
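
The scheme arrived at above (every article keeps one date-based URL forever, and the visible categories are derived from keywords) can be sketched like this. It is a rough illustration, not how CityDesk itself works; the `Article` class, the `index_for` function, the URL, and the keyword names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    url: str                                  # fixed at publication, named by date
    title: str
    keywords: set[str] = field(default_factory=set)


def index_for(articles, keyword):
    """Return the URLs of all articles tagged with keyword, in URL order."""
    return sorted(a.url for a in articles if keyword in a.keywords)


if __name__ == "__main__":
    # Made-up article; the URL never changes after this point.
    shading = Article("/articles/041208.html", "Shading basics",
                      {"computer graphics"})

    # Later the site evolves: a "programming" subcategory appears.
    # Only the keywords change, so nothing is moved and no link rots.
    shading.keywords.add("programming")

    print(index_for([shading], "computer graphics"))
    print(index_for([shading], "programming"))
```

An article tagged with several keywords simply appears in several indexes, while still existing as one copy at one URL.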

You're correct - that linkrotted article of mine that you found is from before I restructured my site. The reason it was allowed to rot is a mix of history and laziness.

The main reason I restructured was because CityDesk at the time (the Home version) had a 500 article limit, and I'd hit it. I didn't want to pay an extra $200 or so to upgrade to the Pro version (the only 'pro' feature I wanted was the ability to post more articles), so I split my site into separate .cty files that each published to a separate subdirectory on the server.

When I did the restructure, I created redirects for the main incoming links I had. Because I had already hit the article number limit, I couldn't create redirects for every article (every redirect is an article). So I didn't create redirects for the articles that didn't get much traffic from off-site links. I know it's not ideal, but I decided to live with it.

Once Fog Creek upgraded all Home users to Pro for free (thanks guys!), I simply couldn't be bothered going back and creating all those redirects. I'm much more careful nowadays, though :-).

By the way, all my separate sites now publish to subdomains of my main domain. So I have photo.pool-room.com, citydesk.pool-room.com, cdfaq.pool-room.com, ds.pool-room.com, etc. It turns out to be a pretty good way to manage a site like mine, as each .cty file is a manageable size instead of having one monolithic .cty for everything.

Darren Collins
Thursday, December 09, 2004

PS - I've now made that URL redirect properly :-).

Darren Collins
Thursday, December 09, 2004

Darren,

why don't you use a .htaccess for redirecting links? Just curious.

Janek Schwarz
Thursday, December 09, 2004

Because the philosophy of CityDesk is that everything lives on the client side. If I want to change stuff in .htaccess, I need to FTP it off my server, update it, and FTP it back again. So I need to remember to keep a backup of that file along with my .cty files (and know which site each .htaccess is for). And if I change servers, I have to remember all the places I had customisations or else they'll break when I upload to my new server.

Also, if you do it using .htaccess, you can't preview the behaviour locally before uploading it to your server.

I know that .htaccess is the 'right' way to do it, but it's easier and simpler for me to just keep everything within CityDesk.

Darren Collins
Thursday, December 09, 2004

Darren,

Why not keep the .htaccess file in CD, edit it in CD, and let CD manage pushing it to the server when you publish?

?
Thursday, December 09, 2004

Thanks for the explanation, Darren.  It proves that your case of linkrot could not have been avoided without you upgrading, so it does not really help my point! :)

But, to blast the point home one more time, a CMS should not produce linkrot during site management assuming you are using the professional (unlimited) version.

Matthew Doucette
Friday, December 10, 2004

Darren,

I manage the .htaccess file in CityDesk.  I don't use FTP to do that.

I understand your concern with previewing. The thing is, preview does not work for me anyway. Therefore, I have a special "preview" location in the publish dialog that publishes to a server on my home LAN (*). This nicely fixes the problem with previewing redirects when using .htaccess.

Janek.
(*) http://www.apachefriends.org/en/

Janek Schwarz
Friday, December 10, 2004

I agree with you all, .htaccess is the proper way to do this stuff. But it doesn't work when previewing a site locally, and it's a hassle to write out the original and the redirect URLs in the .htaccess file when I can quickly and easily create a redirect article in the directory I move a file from.

There's also a certain amount of inertia to go back and change all my subdomains to use .htaccess when it's working well enough as it is :-).

Mine is just a dinky-doo personal site, so convenience often rules.

Darren Collins
Sunday, December 12, 2004
