Fog Creek Software
Discussion Board




Finding a good algorithm

I am about at the stage where I am going to put a registration procedure into my program.

I have read a lot of other stuff (topics on this board included... S. Tanna, you have given some great insights!).

So based on everything I have read I am going to do the following:
- Use a challenge/response process, i.e. the program generates a code, the user gives the code to the company, the company generates a response code, the user enters it, and all works (and yes, I will automate this process).
- I am not going to use a third party application
- I will do what little I can in the code to make it 'not so easy to hack'. No big time wasters, just some simple tips.

So my question is, what algorithm do I use?
I have heard of MD5 and other algorithms. Do I need to use a complex algorithm or can I just write a scummy little algorithm of my own?

I have done heaps of googling to even find information on MD5 etc, but I know little about cryptology so I am lost here. I have mentioned it on a shareware newsgroup or two, but got no serious response.

How complex does this need to be?

Aussie Chick
Wednesday, February 18, 2004

Sounds like a good use for md5.  Libraries for using it are, I believe, freely available.

Ken
Wednesday, February 18, 2004

I'm not getting it.  How would MD5 be used?  What value are you hashing?

Dignified
Wednesday, February 18, 2004

It doesn't have to be complex, but you ought to consider also providing an online authentication mechanism.  That way, your users would never have to call you or email you (a fantastic idea if some of them are halfway around the world).  The system that I designed works this way (they can either enter a purchase code acquired from our online store, or they can purchase the product directly through the registration dialog), and uses the method you describe only as a last resort (no net connection available).

As far as encryption goes, you don't need anything incredibly secure, because your weakest link isn't the encrypted data -- it's the fact that your program code is on an "untrusted" computer.  If somebody really wants to rip you off, they'll just go in and write a 'jmp' over your registration check.  I'd suggest that you use an encryption method that you'll be able to maintain and which doesn't have a lot of arcane constraints.  Even something as simple as using a standard random number generator with a fixed seed to xor your cleartext would probably be fine.
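A minimal sketch of that last idea, assuming a hand-rolled linear congruential generator in place of the C library's rand() (whose sequence is not guaranteed to be the same on every platform). The seed value and the names `SCRAMBLE_SEED`, `lcg_next` and `xor_scramble` are made up for illustration. This only hides data from casual inspection; it is not secure encryption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed seed, baked into the program. */
#define SCRAMBLE_SEED 0x2004u

/* One step of a small linear congruential generator. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* XOR each byte with the next keystream byte. Running it twice with the
   same seed restores the original, so one function both scrambles and
   unscrambles. */
void xor_scramble(unsigned char *buf, size_t len)
{
    uint32_t state = SCRAMBLE_SEED;
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= (unsigned char)(lcg_next(&state) >> 24);
    }
}
```

Calling `xor_scramble` on a buffer and then calling it again on the result gives back the cleartext, which is all the registration check needs.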

K
Wednesday, February 18, 2004

Bruce Schneier's 'Applied Cryptography' is a good reference on encryption.  What you'll take out of it is that you don't want to roll your own.  It's harder than you think, and odds are you'll have borked something even if you think you did everything OK.

How important good encryption is is directly related to how important the data is.  If you just want to eliminate the stupid, then XOR the password with something.  More sensitive data implies stronger systems.

If it wuz me I'd go to http://www.sourceforge.com and search for blowfish, DES, AES, and other encryption algorithms.  Another option is to search Amazon for books on Cryptography, then see if you can download any source code.

That last paragraph could get you into copyright/left/GPL/licensing issues, be aware of what you're doing.

Snotnose
Wednesday, February 18, 2004

Registration exists to keep the honest people honest. Your protection *will* get cracked, so don't try to build the be-all-end-all of registration codes.

The most important thing to do is make sure it's not a burden on your regular customers! Especially for $9; you've got to make the registration simple.

Personally, I'd recommend putting in a nag screen at loadup with a big "Buy Me!" button on it. Encourage people to give away the demo copies.

Also set it up so unregistered copies put "This reference generated by the unregistered..." into every document they write. The user can go back and remove it with Word, of course, but it's one extra step that's soooo easy to get rid of.

Chris Tavares
Wednesday, February 18, 2004

> The user can go back and remove it with Word...

Unless you apply a random password to the generated Word document.  I do that to invoices sent to our customers.

bpd
Wednesday, February 18, 2004

I use an MD5 hash in the registration for my application.  I found some C++ code on the net to do it and it works just fine.

I hash the user's device code (it's a cell phone) and their name along with an extra string.  I then convert the MD5 sum into letters A-Z to create a sixteen letter code.
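The post does not say how the MD5 sum is turned into letters, so the following is only one possible mapping: each of the 16 digest bytes becomes one letter. The digest is assumed to have been computed already by whatever MD5 library is in use; `digest_to_letters` is a hypothetical name.

```c
#include <stddef.h>

#define DIGEST_LEN 16  /* an MD5 digest is 16 bytes */

/* Map each digest byte to one letter A-Z. `code` must have room for
   DIGEST_LEN + 1 characters. Because 256 is not a multiple of 26, some
   letters come up slightly more often than others, which does not matter
   for a registration code. */
void digest_to_letters(const unsigned char digest[DIGEST_LEN],
                       char code[DIGEST_LEN + 1])
{
    for (size_t i = 0; i < DIGEST_LEN; i++) {
        code[i] = (char)('A' + digest[i] % 26);
    }
    code[DIGEST_LEN] = '\0';
}
```

The program and the key generator both run the same hash and the same mapping, then compare the sixteen letters.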

Aussie Chick, If you want the code for MD5 hashing, email me and I'll see if I can dig up the original code in my archives.

Almost Anonymous
Wednesday, February 18, 2004

>It doesn't have to be complex, but you ought to consider also providing an online authentication mechanism. That way, your users would never have to call you or email you (a fantastic idea if some of them are halfway around the world).

This is what I meant when I said “- Use a challenge/response process… (and yes I will automate this process).”


>I'm not getting it. How would MD5 be used? What value are you hashing?
*grins uselessly* okay so I am still learning, and when I said “I have heard of the MD5” I meant only that, I really know very little about it, about encryption etc, this is

>I'd suggest that you use an encryption method that you'll be able to maintain and which doesn't have a lot of arcane constraints. Even something as simple as using a standard random number generator with a fixed seed to xor your cleartext would probably be fine.

>Your protection *will* get cracked, so don't try to build the be-all-end-all of registration codes.
Absolutely, this is one thing I have picked up from other topics/posts. I want to create a simple process, but not go overboard. This is why I want to know ‘will a simple XOR type process do?’

>Also set it up so unregistered copies put "This reference generated by the unregistered..." into every document they write. The user can go back and remove it with Word, of course, but it's one extra step that's soooo easy to get rid of.
I have considered this very same thing, just a slightly annoying process for that user who insists on reinstalling the trial version and hence never paying, but then the big question is “is that user ever going to buy it?”, and the answer could be yes, because next time they are at the uni newsagent and they see that copy that they could buy for $9 they might think ‘what the heck’…but I believe this is the subject for a thread on marketing not registration processes! (Although, I am open for any marketing tips!!)

Aussie Chick
Wednesday, February 18, 2004

The login page to Yahoo! Mail has a javascript implementation of MD5.

Ken
Wednesday, February 18, 2004

AussieChick,

My post was meant to be "What algorithm do you mean?" because it sounds like you're trying to come up with a registration code, and then based on that value, give back an unlock code to consider the program registered.

To put it easily (and therefore inaccurately), MD5 is a digest (Message Digest 5) algorithm that shortens (or lengthens, really) arbitrary length input into a "digest."  Which is basically a semi-unique (But NOT unique) value derived from that data.  The deal is you can't reverse an MD5 hash, and so to find out if two inputs are the same, you hash the test input, and compare the resultant value to see if they are the same.  If they are, then the input was the same*.  So I was kind of wondering where in your registration scheme you were going to be using it, because it will really only mangle the data, and produce seemingly random garbage.  So I guess my question is why don't you just use random garbage?** (or a GUID!)

* Yes, not necessarily the same, but in all probability it is.
** The advantage being that you could hash some values that wouldn't change (like last name) and generate that as their code, so that they wouldn't have to contact you if they installed on another computer and suddenly the approved registration key you sent them didn't work.  But then you have "did they use the same name this install time" problems anyway, so I don't know if it's that big of a life saver.  God I hate long posts like this.

Dignified
Wednesday, February 18, 2004

Okay, well I know enough to say that I am confused. I can see that using something like the MD5 is going to make things more complex (internally at least).

So do I need a hashing algorithm?
Or would just making up a simple XOR function do?

Aussie Chick
Wednesday, February 18, 2004

Once you have an MD5 library (check your email) then it's really not any harder to use than any other method.

It will be more secure than using a simple XOR method.

Almost Anonymous
Wednesday, February 18, 2004

> So do I need a hashing algorithm?
> Or would just making up a simple XOR function do?

That's your call ... would making life more complex for yourself (and more likely to produce errors for your customers) justify the extra security?

Sure MD5 is better, but the bottom line is: will your revenue be higher with MD5 than without?

Personally I would go with a simple algorithm in this case.  Concentrate more on functionality and the extra sales will (hopefully!) more than offset those few people who would be forced  to buy because you used MD5 not XOR.

Rob Walker
Wednesday, February 18, 2004

I am looking into the MD5, but I might just agree with you, simple is better.

I could at least implement MD5 at a later date if I felt the need for it…

Aussie Chick
Wednesday, February 18, 2004

Seeing as I got mentioned, I thought maybe I should add something

The things that I have recently discussed on this board about how to do it, are me thinking about alternate ways to do it.  Many moons before that I have posted on this board, about ways I actually did it.

I have done a number of shareware type registration systems. Some have been cracked.  A couple, despite being in fairly popular applications, have not. 

For one of these, there are numerous "cracks" out there, but all of them (and I have seen probably a dozen) cause (a) a "hey, you've registered" message, but (b) also cause the application to fail in its intended function.  This application is a special case which has methods of protection not available in a typical app, so I will not discuss further.

The other app is a more typical Win32 app. It has been out for long enough and listed in enough shareware sites that you would expect cracks. I just checked several of the crackz type search engines - none has a crack.

Essentially the trick that I use is to turn all the hacker's skills against them. It is not about the quality of encryption.

1. Consider an irreversible algorithm (as a simple example)

y = x % 100 ;
if ( y == 1 ) { AfxMessageBox( "You are registered" ) ; } else { AfxMessageBox( "Incorrect key try again") ; }

Now a hacker will be able to find the message box call, and work out y should be 1.

What they do not know is whether x is supposed to be 1, 101, 201, etc.

For the purpose of generating the message box (which is the hacker's goal) these are all equally valid values of x (call this "sort of right")

However if x being 1, 101, 201, etc. has some non-obvious effect on the rest of your application, they can guess a sort of right value, but it's wrong in terms of making your app function correctly.

(Incidentally % is just that, a simple example. There are many other things which are not reversible in maths or C; consider for example /, if (...) goto, recursive algorithms, etc.)
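To make point 1 concrete, here is a small sketch, under the assumption that the genuine key is 701 and that the application happens to need an inches-to-centimetres factor. Both assumptions, and the function names, are invented for illustration. The part of x that the visible check throws away is what the rest of the program depends on.

```c
#include <stdbool.h>

/* The visible check: any x ending in 01 passes, exactly as in the post. */
bool key_looks_registered(unsigned x)
{
    return x % 100 == 1;
}

/* The non-obvious effect: the part of x that the check ignores is used as
   a working constant. The result is centimetres times 100. With the
   genuine key (701) the multiplier is 247 + 7, the correct factor; with
   1, 101, 201, ... it is something else, and the answer is quietly wrong. */
unsigned cm_from_inches_x100(unsigned inches, unsigned x)
{
    return inches * (247 + x / 100);
}
```

With x = 701, ten inches comes out as 2540 (25.40 cm). With the sort-of-right x = 101 the "registered" message still appears, but ten inches comes out as 2480. The correct factor never appears as a literal in the code.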

2. What if instead of just x, and y it was (again a simple example)

y[n] = x[n] % 100 ;

Then instead of having to get a single value (x) exactly right as opposed to sort of right, they have to get many values exactly right.

3. Continuing with my simple example, the x[n]'s can be encoded into a string.  For example, using base 36ish (I use caps letters and digits, but omit letters which look like numbers and vice-versa e.g. O vs 0 and I vs 1, giving base 32 IIRC). 

On the end of the string use some check digits to check the entire string has been entered correctly, and no characters have been reversed or switched places.  This prevents legitimate customers inadvertently entering sort-of-right as opposed to exactly-right strings by mistyping.
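A sketch of how point 3 might look, assuming a 32-symbol alphabet (capitals without I and O, digits without 0 and 1) and two check characters computed from a position-weighted sum. The post does not describe the actual check-digit scheme, so this weighting and the function names are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* 24 letters (no I or O) plus 8 digits (no 0 or 1) = 32 symbols. */
static const char ALPHABET[] = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

/* Value 0..31 of a key character, or -1 if it is not in the alphabet. */
static int symbol_value(char c)
{
    const char *p = (c != '\0') ? strchr(ALPHABET, c) : NULL;
    return p ? (int)(p - ALPHABET) : -1;
}

/* Position-weighted sum of the first n characters, reduced to 10 bits
   (two base-32 digits). Returns -1 on an invalid character. */
static int weighted_sum(const char *s, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++) {
        int v = symbol_value(s[i]);
        if (v < 0) return -1;
        sum += (unsigned)(2 * i + 1) * (unsigned)v;
    }
    return (int)(sum % 1024);
}

/* Write body plus two check characters into out, which needs
   strlen(body) + 3 bytes. Returns false if body has invalid characters. */
bool add_check_chars(const char *body, char *out)
{
    size_t n = strlen(body);
    int sum = weighted_sum(body, n);
    if (sum < 0) return false;
    memcpy(out, body, n);
    out[n] = ALPHABET[sum / 32];
    out[n + 1] = ALPHABET[sum % 32];
    out[n + 2] = '\0';
    return true;
}

/* True if the last two characters match the check over the rest. */
bool key_typed_correctly(const char *key)
{
    size_t n = strlen(key);
    if (n < 3) return false;
    int sum = weighted_sum(key, n - 2);
    return sum >= 0 && key[n - 2] == ALPHABET[sum / 32]
                    && key[n - 1] == ALPHABET[sum % 32];
}
```

For bodies of up to 16 characters, the odd position weights mean that one mistyped character, or two different characters swapped, always changes the sum. The check only tells the customer "you mistyped"; it says nothing about whether the key is genuine, so it gives a hacker no help with the exactly-right values.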


4. Some further refinements:

(i) The irreversible steps I referred to in point 1 run into millions of instruction cycles.  They cannot simply be taken out by a hacker, as they result in critical data being set up during the process (and the intermediate values of the data themselves affect the app too, so you can't just save the end result).

(ii) Imagine several onion like layers of 1 and 2.  Strip one off, and you are at another. Imagine also if you strip off an outer layer incorrectly it takes you to a whole different onion.

(iii) All sorts of misleading clues in the binary

(iv) An app need *not* be complete.  What if the user's data entry includes information for how to perform certain operations in the registered version of the app?  What if a sort-of-right key resulted in an equally valid (but incorrect, in terms of producing the desired result from using the app) series of operations?

(v) All the above should not be localized in one place.  The preprocessor may be your friend in maintaining some degree of modularity.

S. Tanna
Wednesday, February 18, 2004

A quick note on using MD5 or any other hashing technique.  The way you use this for registration is that you use it to create a *signature*, by combining a serial number, and a private key.

Let's say that I want to say "AC is cool", and my private key is "Aussie".  I take those two strings and concatenate them, and then compute the MD5.  Let's say it's something like 5749057493755743957.  This number is now "proof" that I said "AC is cool".  In registration terms, I could replace "AC is cool" with a product serial number and the name of the person the copy is licensed to.  The MD5 is now proof that I authorized that serial number and name.
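The concatenate-then-hash step can be sketched as follows. To keep the example self-contained it uses FNV-1a, a simple non-cryptographic hash, purely as a stand-in for MD5; a real scheme would call an MD5 (or stronger) library at that point. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* FNV-1a, used ONLY as a stand-in so the sketch compiles on its own.
   It is not a cryptographic hash. */
static uint64_t fnv1a(uint64_t h, const char *s)
{
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 0x100000001b3ULL;  /* FNV prime */
    }
    return h;
}

/* Hash the message followed by the secret, as in the
   "AC is cool" + "Aussie" example. */
uint64_t sign_message(const char *message, const char *secret)
{
    uint64_t h = 0xcbf29ce484222325ULL;  /* FNV offset basis */
    h = fnv1a(h, message);
    return fnv1a(h, secret);
}

/* Anyone who holds the secret can recompute the signature and compare. */
bool signature_is_valid(const char *message, const char *secret,
                        uint64_t signature)
{
    return sign_message(message, secret) == signature;
}
```

Note that `signature_is_valid` needs the secret as an argument, so whichever side runs the check must hold it.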

This technique is mainly useful for online schemes, where the validity is checked on the server.  It doesn't work well for validation on the client, because then the client has to know the private key ("Aussie").  This means that a cracker can find out the private key, and then issue as many registrations as they like, or write a key generator and distribute it to people.

So, for your situation, a simple hashing approach doesn't work.  You need something more like a public key algorithm, where the key that's used on the client doesn't match the key you use to generate the registration code.

Phillip J. Eby
Wednesday, February 18, 2004

The problem with Eby's approach for client side is more fundamental

Whether the keys are private or public keys or MD5 or whatever, it's going to boil down to something like (in C, obviously hacker will be in assembler)

BOOL IsKeyOkay(etc)
{
// do the check on the key
// lots of complicated stuff
if key_matches...etc
{
return TRUE ;
}
return FALSE ;
}

AND

if ( IsKeyOkay(etc) )
{
  DoRegistrationOption(etc) ;
}


The problem with this is the hacker has access to your binary, and simply can change to the equivalent of


if ( !IsKeyOkay(etc) )
{
  DoRegistrationOption(etc) ;
}

(this is simply changing a jne to je or vice-versa, a single byte change)

or

BOOL IsKeyOkay(etc)
{
return TRUE ;
}


It actually gets worse because they may not even need to change the executable.

If they actually look at the IsKeyOkay function ... they simply run thru every possible value (which is usually less intimidating than it might seem as serial keys etc., are usually limited length and valid possible characters) until they get the right one to trigger it returning OK.  Hence the key generator

S. Tanna
Wednesday, February 18, 2004

There are some good anti-code-cracking tips at http://www.inner-smile.com/nocrack.phtml (like ways to obfuscate your code).

Cubist
Wednesday, February 18, 2004

There is a good article called "Encrypting data with the Blowfish algorithm" on embedded.com. It is quite detailed and includes some code examples.

http://www.embedded.com/showArticle.jhtml?articleID=12800442

Guy Eschemann
Thursday, February 19, 2004

"Personally, I'd recommend putting in a nag screen at loadup with a big "Buy Me!" button on it. Encourage people to give away the demo copies."

This is for students. Your sales will be 0.0 if you do it this way. How many students do you know that registered for Winzip?

First step back and do a quick threat modeling. What is it you are trying to protect yourself from? Wire snooping? Key sharing? Key generators? Only then can you consider (partial) solutions.

As for the crypto stuff: do not try to invent your own!
There is plenty of support in the base platform that you can use out of the box (look at CAPICOM and CryptoAPI).

Just me (Sir to you)
Thursday, February 19, 2004

Thanks guys.

First, no I wouldn't dream of a nag screen. They are evil.

Secondly, the main person I want to protect myself against is the uni student who thinks 'hey this is good' and then passes it on to ten of his mates, who in turn...
Okay, so maybe not as viral, but the point being it is a cheap program that will be easy to buy, I don't want to lose all my potential buyers because their mate has a copy for free.
I am not really worried about hackers. It is currently a small-fry program, even if it was hacked cracked and roasted, well good luck somebody finding a copy, really is it worth two hours of looking for a cracked copy when it costs $9 at the student newsagents?

I have gone with using the MAC address, running it through an XOR algorithm, and returning it to the user (so basically a codey-looking MAC address).
That, combined with the user name, is hashed through an MD5 algorithm (possibly overkill and/or used incorrectly, but it is being used as much so that I can learn). The result is stored internally.

The user enters the codey-looking MAC address and username onto the website (or some other more automated method), the website responds with the 'result' code, which the user then enters into the program. When the program gets the match all runs well.

From the server side I will be able to blacklist any names that are being used repeatedly. I will probably also require some sort of registration key to ensure that the same CD is not being passed around.

So this is slightly annoying, but heck we are uni students we are used to filling out miles of forms, and I really think most users are used to registration processes now. It is part of the software process.

Aussie Chick
Thursday, February 19, 2004

Oh, and on top of this I will implement a lot of the 'hacker prevention' methods I have been reading about.

Again, mostly overkill as I don't think hackers will be my big problem, but again good experience.

Aussie Chick
Thursday, February 19, 2004

Using the MAC address will tie it to the installed machine and apart from occasions where the network card is changed do you really want to restrict allowable use to a single machine? 

And in a university environment, single machine use might mean many people.

Simon Lucy
Thursday, February 19, 2004

I'll second Simon's comment -- please think twice about using the MAC address.

My laptop gets regularly switched between a wireless card (at home) and a wired ethernet card (at work).  What is its MAC address?

Rob Walker
Thursday, February 19, 2004

>Using the MAC address will tie it to the installed machine and apart from occasions where the network card is changed do you really want to restrict allowable use to a single machine? 
>And in a university environment, single machine use might mean many people.


This actually works very well for the type of security I want.
My main concern is sharing of registration information, the best way to stop that is to ensure the company gets contacted every time an installation is done.

Although I didn't realise there was potential for a MAC address to change on a regular basis (ie aside from re-formats and windows re-installs).

Aussie Chick
Thursday, February 19, 2004

> Although I didn't realise there was potential for a MAC
> address to change on a regular basis (ie aside from re-
> formats and windows re-installs).

In fact, the MAC address is baked into your network card.  It will persist over a reinstall of windows.

Many cards also have utilities to set the MAC address, so it isn't as globally unique as people tend to think.

Something else to watch out for with MAC address based schemes is which MAC address do you mean?  A machine may have multiple MAC addresses at once.  For example, two network cards.

You need a scheme that doesn't break because the user added a wireless card to their laptop and that card is now the first reported MAC address, and the one you were expecting is now the second.

Applications like VMware also install 'virtual' network adapters that have MAC addresses.  You need to ensure that if the user installs (or uninstalls) some third party piece of software yours doesn't break.
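One way to avoid depending on adapter order, sketched under the assumption that the list of current MAC addresses has already been collected through whatever platform API is available (that part is not shown). The function name is hypothetical. The check passes if the registered address appears on any adapter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6

/* Accept the machine if the MAC recorded at registration is present on
   ANY adapter, regardless of the order in which adapters are reported. */
bool machine_matches(const unsigned char registered[MAC_LEN],
                     const unsigned char current[][MAC_LEN], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (memcmp(registered, current[i], MAC_LEN) == 0) {
            return true;
        }
    }
    return false;
}
```

This survives adding a wireless card or a virtual adapter, but it still fails if the card that was registered is removed or swapped out.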

Nothing is ever as simple as we'd like it to be!

Rob Walker
Thursday, February 19, 2004

Yukko.

But if not the MAC address, then what?

Aussie Chick
Thursday, February 19, 2004

Metrowerks CodeWarrior uses the disk serial number -- which I believe will change on a reformat but is otherwise constant.

If they reformat their drive, they'll have to reinstall your application anyway so that's a perfect time for re-registration. 

(Windows also does funny things with the MAC address when you don't have a network card)

Almost Anonymous
Thursday, February 19, 2004

If you want to lean toward minimal impact on the user, at the risk of some piracy, then how's this:

When the user registers the software they supply an email address.

To activate, the application generates a random piece of data and encrypts it using a key derived from the email address.

The user enters this into a web page form along with their email address.  The server decrypts and returns the original random piece of information. (For extra credit implement a web service and save the user the manual step).

You get to put a limit on the number of registrations you accept from a given email address before putting up a 'please contact us to explain why you need to install this software so many times' message.  As part of the process the server tells the application how many times it has been registered and this is reported to the user (so they know you are tracking it).
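The server-side counting could look something like this rough sketch. It assumes an in-memory table and a limit of three installs per address; a real server would keep the counts in a database, and all the names and limits here are made up.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ACCOUNTS 1000
#define MAX_EMAIL    128
#define MAX_INSTALLS 3    /* hypothetical limit per email address */

struct account {
    char email[MAX_EMAIL];
    int  installs;
};

static struct account accounts[MAX_ACCOUNTS];
static size_t account_count = 0;

/* Record one activation for `email`. Returns the new total, so it can be
   reported back to the user, or -1 if the request is refused (limit
   reached, table full, or address too long) and the "please contact us"
   message should be shown instead. */
int record_activation(const char *email)
{
    for (size_t i = 0; i < account_count; i++) {
        if (strcmp(accounts[i].email, email) == 0) {
            if (accounts[i].installs >= MAX_INSTALLS) return -1;
            return ++accounts[i].installs;
        }
    }
    if (account_count == MAX_ACCOUNTS || strlen(email) >= MAX_EMAIL) {
        return -1;
    }
    strcpy(accounts[account_count].email, email);
    accounts[account_count].installs = 1;
    account_count++;
    return 1;
}
```

The returned count is what the application would show the user, so they know registrations are being tracked.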

The advantages of this approach are:
  - it stops totally trivial copying
  - the user is aware you are tracking registrations

The disadvantages are:
  - anyone who reverse engineers the algorithm can generate their own keys
  - the application needs some kind of 'installed OK' flag somewhere to decide if it is already registered.  Anyone discovering what this is can set it manually.

Alternatively, look for a 3rd party component / web server solution.  Do you want to have to host the server side yourself?

Rob Walker
Thursday, February 19, 2004
