Fog Creek Software
Discussion Board




Anyone know how Sabre and Apollo work?

I have heard for a long time that airline
reservation systems have special needs. I have
always wondered what key ideas in database
programming allow complex transactions to
complete correctly at such a high rate of
commits and updates per second. Is there
something we can borrow from these designs to
help very complex web services scale better?
Any ideas are welcome.
Does anyone know how other complex databases
out there work? Intelligence agencies?
Large data warehouses? Any insights into how
they scale?

Any pointers to reading materials covering
this stuff in depth would be most
appreciated!

Li-fan Chen
Tuesday, February 25, 2003

Optimized machine-language code and complete avoidance of a GUI have a lot to do with it, I think.

Stephen Jones
Tuesday, February 25, 2003

Search for information on IBM's "TPF," which evolved from ACP, the operating system created for the SABRE project.

SenorDingdong
Tuesday, February 25, 2003

The 'huge mainframe' approach is very expensive. See http://www.paulgraham.com/carl.html for a much more 'Google' style approach (distributed system running on a large farm of cheap commodity hardware).

Chris Newcombe
Tuesday, February 25, 2003

The approach originally was to reduce OS overhead, code in assembler, and keep everything simple. I once heard that they used disk striping to improve performance, but to add new disks they had to unload the data and reload it.

The hardware was also specially modified IBM mainframes. There were concerns about Y2K, as the machines are old and they did not want to move to newer models.

I think there are some compilers for the system now, but it is not a general-purpose system.

John McQuilling
Tuesday, February 25, 2003

Would I be on the right track if I studied
how to use, or studied the inner workings of,
databases like KDB and the K language from
Arthur Whitney? Other keywords to search for
seem to be "high-volume transaction
processing" and "large databases." I am having
difficulty finding articles explaining how
KDB and the K language do what they do, but
it's a start, and maybe the right direction.

-- David

Li-fan Chen
Tuesday, February 25, 2003

I have never heard Adabas and its language NATURAL discussed here, but they are capable of this kind of processing.

See Software AG's web site:

http://www.softwareag.com/adabas/


Tuesday, February 25, 2003

No need for assembler here. A friend of mine was a project manager on a large Java project; they built a Google-like search engine in Java. The engine runs as a distributed system of dozens of hundreds of Intel computers, all of its components are clustered and distributed, and you can use lots of crawlers or indexers to scale the engine. The system uses a custom database built in Java.
Good architecture is the key, that's all.
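To make the scaling idea concrete, here is a minimal sketch (my own illustration, not the actual engine's design): documents are hash-partitioned across index shards, and a query is scattered to every shard and the postings merged. All class and method names are hypothetical.

```java
import java.util.*;

// Hypothetical sketch: a hash-partitioned inverted index across N shards,
// queried scatter-gather style. Real engines add replication, ranking,
// network transport, etc.; this only shows the partitioning idea.
class ShardedIndex {
    private final List<Map<String, Set<Integer>>> shards;

    ShardedIndex(int n) {
        shards = new ArrayList<>();
        for (int i = 0; i < n; i++) shards.add(new HashMap<>());
    }

    // Route each document to exactly one shard by hashing its id.
    void index(int docId, String text) {
        Map<String, Set<Integer>> shard =
            shards.get(Math.floorMod(docId, shards.size()));
        for (String term : text.toLowerCase().split("\\s+"))
            shard.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
    }

    // Scatter the query to every shard, then gather and merge the postings.
    Set<Integer> search(String term) {
        Set<Integer> hits = new TreeSet<>();
        for (Map<String, Set<Integer>> shard : shards)
            hits.addAll(shard.getOrDefault(term.toLowerCase(), Set.of()));
        return hits;
    }
}

public class ShardDemo {
    public static void main(String[] args) {
        ShardedIndex idx = new ShardedIndex(4);
        idx.index(1, "cheap flights to Boston");
        idx.index(2, "Boston hotel deals");
        idx.index(3, "cheap hotel");
        System.out.println(idx.search("boston")); // [1, 2]
        System.out.println(idx.search("cheap"));  // [1, 3]
    }
}
```

Adding capacity is then a matter of adding shards (and re-partitioning), which is why commodity hardware works for this class of problem.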

Slava
Wednesday, February 26, 2003

Should be: dozens or hundreds or thousands of Intel-based computers

Slava
Wednesday, February 26, 2003

Sabre / Galileo are GDS/CRS systems, not databases. They may use a database within, but their main purpose is packet switching. Basically a GDS/CRS is a network, in some ways similar to the Web, between agencies and carriers.

For transaction processing I recommend "Transaction Processing: Concepts and Techniques" by Gray and Reuter.

Table of contents ...

http://www.amazon.com/exec/obidos/tg/detail/-/1558601902/ref=lib_rd_ss_TC01/102-2075276-0295325?v=glance&s=books&vi=reader&img=4#reader-link
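For a feel of one core idea the book covers, here is a toy sketch (my own, not taken from the book) of atomicity via an undo log: each write records the old value first, and an abort replays the log backwards so the store never exposes a half-finished transaction.

```java
import java.util.*;

// Toy sketch of transaction atomicity via an undo log. Illustrative only:
// real TP systems use write-ahead logging, locking, and durable storage.
class MiniTxn {
    private final Map<String, Integer> store;
    private final Deque<Map.Entry<String, Integer>> undo = new ArrayDeque<>();

    MiniTxn(Map<String, Integer> store) { this.store = store; }

    // Record the old value before overwriting, so we can roll back.
    // (Assumes the key already exists in the store.)
    void write(String key, int value) {
        undo.push(Map.entry(key, store.get(key)));
        store.put(key, value);
    }

    void commit() { undo.clear(); }   // changes become permanent

    void abort() {                    // replay the undo log backwards
        while (!undo.isEmpty()) {
            Map.Entry<String, Integer> e = undo.pop();
            store.put(e.getKey(), e.getValue());
        }
    }
}

public class TxnDemo {
    public static void main(String[] args) {
        Map<String, Integer> balances = new HashMap<>(Map.of("a", 100, "b", 50));
        MiniTxn t = new MiniTxn(balances);
        t.write("a", 70);             // debit one account...
        t.write("b", 80);             // ...credit the other
        t.abort();                    // transfer fails: both writes undone
        System.out.println(balances); // {a=100, b=50}
    }
}
```

The interesting engineering in systems like TPF is doing this durably, concurrently, and thousands of times per second; the book works through exactly that.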

Dino
Wednesday, February 26, 2003

Slava,

you can build a Google-like thing out of tin cans and strings, that is not the problem. The reason you still want efficiency even if the problem can be parallelized to a very high degree is that you would rather run it on "dozens" than on "thousands" of machines.

Just me (Sir to you)
Wednesday, February 26, 2003

>>you can build a Google-like thing out of tin cans and strings, that is not the problem. The reason you still want efficiency even if the problem can be parallelized to a very high degree is that you would rather run it on "dozens" than on "thousands" of machines. <<

It depends on which option is more economically efficient: build everything in Java and run thousands of cheap PCs, or build everything in assembler and run hundreds of PCs, or one mainframe or supercomputer.
It seems to me that the Java-plus-cheap-PCs approach is simply more efficient in economic terms: yes, you'll spend more on hardware, but much less on software development and system maintenance.

Slava
Wednesday, February 26, 2003

Important note: yes, Java wins the economic contest most of the time, but sometimes (read: when you are limited on hardware) C or assembler will be the winner.
Another example from my experience: SMS service software. It was built in highly optimized C, because we weren't able to use many computers -- everything had to reside on one specialized system.

Slava
Wednesday, February 26, 2003

In the carl.html article on Paul Graham's site, it mentions the read-only files that contain the memory-mapped structures of fee and flight info for sale.

My question is: the author says the files are read-only.

So how often do they unmap the file, download the latest file, and remap it?
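One common pattern (a guess at what they might do; the article doesn't spell it out) is to write each new snapshot to a fresh file, map it read-only, and simply drop the reference to the old mapping. In Java, that looks roughly like:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

// Sketch of the remap-on-update pattern: readers hold a read-only mapping
// of the current snapshot file; an update arrives as a new file, which the
// reader maps in place of the old one. File names here are made up.
public class RemapDemo {
    static MappedByteBuffer mapReadOnly(Path p) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    static String read(MappedByteBuffer buf) {
        byte[] bytes = new byte[buf.remaining()];
        buf.duplicate().get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path v1 = Files.createTempFile("fares", ".dat");
        Files.writeString(v1, "fares-v1");
        MappedByteBuffer buf = mapReadOnly(v1);
        System.out.println(read(buf));            // fares-v1

        // New data arrives: write a fresh snapshot file and remap. The old
        // mapping is now unreferenced and the OS reclaims it eventually.
        Path v2 = Files.createTempFile("fares", ".dat");
        Files.writeString(v2, "fares-v2");
        buf = mapReadOnly(v2);
        System.out.println(read(buf));            // fares-v2

        Files.delete(v1);
        Files.delete(v2);
    }
}
```

With this scheme "how often" is just however often a new data file is published; swapping the mapping is cheap because nothing is copied, only page-table entries change.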

Li-fan Chen
Wednesday, February 26, 2003
