Fog Creek Software
Discussion Board




Filtering Mechanism in Access 2000

Hi Guys,
I'm wondering if someone can help me on this.
I have an access database that has a text field named "Description"
I want to do a vb module in access (or anything else) that would act like a filtering mechanism. Ok here is the problem..
A description field would have something like this

PIPE, 6", Sch. XS, BE, SEAMLESS, ASME B36.10, Charpy per ASME B31.3, Section 323.3 at -45°C, NACE MR-01-75

Now I need to divide the output into 4 parts called Pipes, consumables, Valves and instruments (this might mean 5 tables, one for unsuccessful ones). For example, the above has 'Pipe', I need to be able to extract 'PIPE' (as well as its accompanying fields in the record) and display it seperately from fields that would have terms like 'valves', 'instruments' and consumables.

But there is another problem, sometimes they do not appear as pipes, (yet must come under pipes)
For example
ELBOW, 2", 90°, LR, Sch.XXS, BW, ASME 16.9, Charpy per ASME B31.3, Section 323.3 at -45°C, NACE MR-01-75

Now 'elbow' is a name of a pipe that must be grouped/categorised with pipes. Other terms come under Pipe, eg Flange, Bend, Reducer, TEE etc..

The same applies for Consumables that has a sub-term like 'gasket'.

Another dilemma that has been puzzling me, in valves you have some items called 'Valve instrument', now this must come under valves and not instrument.

Help on this would be greatly greatly appreciated, thankyou in advance.

Mark
ps I can send the current access database to show how far I've gotten with it.

Mark
Sunday, May 18, 2003

If your data text block is ALWAYS divided into 8 fields, then this is a reasonably workable problem.

This is really a parsing problem. You would first break out the text block into 8 separate fields. That is a easy 5 minute job.  (every programmer I know has some “delimiter” routines to pluck out the “n” value from a delimited list). If by some strange event you as a developer don’t have some delimiter code hanging around, then you can use the built in vb split command if you want also. (access 97 does not have that, but access 2000 and beyond does).

However, if the “,” that separates the data is not organized, then your problem is

                    GIGO

The above problem here is the oldest term in our industry: GIGO

GIGO is a term used when one person in a company calls a pipe a pipe, and another person calls a pipe a thing a ma do.  You are wasting everyone’s time until a common language, and a common definition of what means what is defined here. A ugly block of text in a field does not represent any kind of organized data processing at all.

By the way, that term GIGO was invented at least 40 years ago in our industry:

GIGO is a term for
Garbage In = Garbage Out.

If you have not control on how that “text” block is being created, then you need to get control over that.

Of  course, as mentioned, if the 8 fields (separated by a comma) is consistent, then I don’t see the much of the problem here.

If the data is consistant, then you can spend some time massaging the data into something more worth while. A few hours later, your data will be normalized.

Often, when one tries to computerize a process in a company, if that process is not defined at all well, then the result of computerizing the process is not at all going  to be well at all either.

If you cannot ensure a consistent naming convention for that ugly block of text, and further you can’t assume the number of fields separated by “,” (comma), then I am at a complete loss here.

You can, and should certainly create a table of synonyms , so that you can break out this data into a proper normalized data table. So, perhaps creating a simple table of categories so that

Name            Category
ELBOW        Pipe
Pipe              Pipe
Flange          Pipe

So, you certainly can build a above table to “translate” what those names are “supposed” to be, but human will have to do that.  With the above type table, it then it is easy to start normalizing your data. And, with normalized data, you defiantly should NOT have 5 different tables, but one table, and a extra field that tells you if it is a valve,  consumable or a instrument.

You might want to stop all code and data design here, and sit down before you start working in ms-access.

I would read a few articles on data normalizing. Here is are few:

ACC2000:
Understanding Relational Database Design" Document Available in Download
Center
http://support.microsoft.com/default.aspx?scid=kb;EN-US;234208

ACC2000: Database Normalization Basics
http://support.microsoft.com/default.aspx?scid=kb;en-us;209534

There are also a ton of ms-access newsgroups out there, and for most questions on ms-access, you will find those groups your best bet.

And don't forget : www.mvps.org/access

The above has a ton of useful code examples.

Albert D. Kallal      (Microsoft MVP [ms-access])
Edmonton, Alberta Canada
kallal@msn.com
http://www.attcanada.net/~kallal.msn

Albert D. Kallal
Sunday, May 18, 2003

No offense Mark, but isn't this the Excel to Access conversion that folks helped you with a few threads earlier.  In fact I believe some of them actually gave you a working program for conversion.

Now I'm not trying to be rude here, but How in the hell did you get a job where you don't know what you're doing?  I mean WTF?  Some folks on this board don't even have jobs and you come to us with data normalization questions and how to do this and that?  The fact is your problems are somewhat simplistic and even a recent college graduate should be able to figure it out.

I'm not so sure I could swallow helping you for free.  How does $150/hr sound?

Pained
Sunday, May 18, 2003

To Pained,
No offence taken. I think I've asked this question out of despair. Just been under alot of pressure. I'm experimenting with queries at the moment. My position isnt actually db developer its web designer, as they db guy has left and there trying to find a replacement. Although i do have a basic access skills. And I'm reading up in my spare time on databases, as much as I can. And I'm currently on probabation, so they pay me peanuts. I might as well be working for free, it's just the experience I'm after.

Mark
Sunday, May 18, 2003

And honestly, if you do not want to help, then I wont be offended, it's a forum. Its a choice one makes to help one another or not, without neccesarily having to understand the circumstances behind the question. It is not like I'm asking for my work to be done, trust me, what I'm asking is a small part of my work.

Mark
Sunday, May 18, 2003

Dear Albert
Thank you very much for your detailed help. Its just when i was given this project I assumed I can start from scratch, but its importation process that has thrown me off course, and its only now that I'm gathering pace. I do undersrtand the basics of the normalisation process, and I'm working on it now. Also, I've decided I'll use append queries, and run them via asp.

Mark
Sunday, May 18, 2003

Dear Mark,
                Pained is evidently annoyed over something. I looked at your spreadsheet and the data was in a mess that stopped it from being converted easily.

                You're really in a mess here because you have been given a messy rewrite job when you would have enough on your plate designing something outside your real field of experience from scratch.

                    As Albert says, what you need to do now is to sit down and design the database as if you didn't actually have any data there at all. When you are sure of the basic design, then you can go around moving the data into the correct place in the tables.

                  One thing you ought to do is to stop thinking about an automated solution and start thinkiing about what the data really means. When you have decided that you will probably find that you can do the job piecemeal.

Stephen Jones
Sunday, May 18, 2003

Dear Stephen,
You do have a valid point. Its very frustrating, I know the objectives of what i have to do. But its not like i was given something to start from scratch, hence the need i sought for automated solutions to actually bring the data into the right place so i can fulfil my objectives. It's all too unprecedented.
But I'll spend the day doing just that.
Thankyou for your kind response
Regards
Mark

Mark
Sunday, May 18, 2003

Hi Mark,

Wow, what a mess eh?  Well take comfort in the fact that this sort of thing happens in IT departments all over the world.

Anyway.  If I were you, I would start from scratch.  Take a couple printouts of the spreadsheets and identify the entities or objects that you find.

For example, when I look at your spreadsheet I see the following entities:

1. A Project
2. A List of parts used to build the project
3. A Description and Specification of each part

Now for each of those entities you need to ask a question?

How is each entity related? (If at all)

For example a project contains a list of parts.  Can it contain more than one lists of parts?  Looking at your spreadsheet I see "Fabrication Materials" and "Erection Materials".  If both of these are lists of parts then you could say that a project contains two lists of parts, one list contains the Fabrication materials and the other list contains the Erection materials.  Now what is the difference between Fabrication and Erection materials?  Fabrication materials are the individual parts that are assembled and held or "glued" together by the Erection materials. (For all I know :-)

Sooooooo... What can we gather from all this mumbo jumbo? 

We have to make precise statements about the nature of each entity we just defined.

1 Project can contain 1 or more Parts Lists
1 Parts List can contain many different Parts
1 Part can be classified according to its category

Just remember keep asking questions.  "If I could ask Project Number 1 how many Parts Lists it contains it would tell me that it can contain zero, one or many different parts lists."  In doing so you establish relationships among the entities. 

The entities become tables in your database and the questions answer the type of relationship they share.

As for parsing the description you could use the "split" function and split the string on the "," or you could simply write a little parsing routine that accumulates the letters of the current string until it hits the comma.

Hope this all helps you and I sympathize with you as to what you are trying to do.

Dave B.
Sunday, May 18, 2003

Mark,

You appear to be in a real dilemma and not getting much support. If I read your situation correctly, the basic problem you seem to have is that you need first to create a sensible database structure with which to work, which you don't have experience in doing, and then to load it from an existing database. Depending upon the consistency with which items have been described in your existing database, the latter may be very difficult to do.

The former, however, is a very well researched area. To start with take a look at ISO 15926. This is part of the ISO STEP (standard for the exchange of product data) series and is a complete data model and reference data set for describing all forms of process equipment.

If you google for it you'll find quite a lot of detail. It may not solve you problem, but even if you only spend a morning getting an overview you'll probably end up with a clearer picture of how to describe equipment and components.

David

David Roper
Sunday, May 18, 2003

To Dave B.
LOL, trust me man, I've actually taken comfort in the fact that this happens around the world. Atleast now, i know what to expect in future. Some managers just assume too much, they don't neccessarily realise that there are different streams of I.T.
Anyway, it's not as complicated as it looks. The highlighted parts of the fabrication materials, is irrelevant, its just the fields that have to be imported.
I just came back from a meeting with a former lecturer. She helped me out a bit, and its only now I can gradually see how wrong I went wrong about it.
I told her, they should introduce a new unit on campus, "Introduction to Database Importing" :)
She gave me a book called "Database Systems" by Rob Coronel to keep :)
Anyway, I'll see how this goes. Thanks for your input Dave.
Cheers
Mark

Mark
Monday, May 19, 2003

David Roper, 
Thankyou, What you say is very plausible, as I was thinking last night the need to standardise my approach in what I do, in whatever I do. I can't allow a repeat of this mess in future.
But I also have a good book now that will aid me in the correct db direction.
Mark

Mark
Monday, May 19, 2003

The suggestions are good and they all work, but only if you know what they mean, and at the moment you don't.  You're being asked to do something for which you don't have the knowledge or competency and no amount of help on a forum is going to provide that.

For someone with the knowledge and experience it is a trivial task (what might not be trivial is the reason its in the mess it is in the first place and I wouldn't underestimate that).  In this case I would treat it as a problem to be outsourced.  There are any number of people around that can do this, though this forum isn't one which should be used either for touting for business nor searching for people to do the work.

And its not your call to look for quotes or organise the contract of course.  You know best your own situation and if you feel you're in a hole and that your job is on the line then I think everyone can sympathise, but how about approaching the manager with the solution that someone else handles it and you get on with the job you're supposed to do?

Simon Lucy
Monday, May 19, 2003

I agree completely with Simon.  Unless there's some reason for you to be learning how to use MS Access -- and probably even then -- it's not worth the time for you to figure out how to do it yourself right now. 

This sounds like a job that could be done quickly by someone who knows what they're doing.  If you persevere you'll be able to do it and you'll learn a lot in the process. 

But you may take 5, 10, 20, 30 or more times as much time as someone who already knows how to do it.  And the end product from someone else is also likely to be better.  Outsourcing seems like a good choice to me.

Herbert Sitz
Monday, May 19, 2003

Simon and Herbert,
                              As far as I can tell  the problem isn't one peculiar to Access but a  more general one of normalization. Mark mentions that he did the courses but is a little rusty.

                              The reason outsourcing will possibly not be the answer is that to decide on the design of a relational database you need to know what the data means, and this is a domain problem. Mark works for the company so he is probably in a much better situation to find out what the relationship between each entry in his field is than any outside contractor. He knows what a valve instrument is; we don't.

Stephen Jones
Monday, May 19, 2003

If Mark does have some general database knowledge, that just means the issue has even more to do with Access, since that's generally the easiest tool to do a data import into.  (Once data is imported and properly normalized in the .mdb you can shoot it anywhere you want.)

And that might be all the more reason to get some outside help.  Mark could work side-by-side with an Access expert, perfect the import process, and devise the most cost-effective system to import the many spreadsheets he's dealing with.  Then Mark can take over and do the remaining imports on his own.

It's hard to tell without seeing the project and without being able to gauge Mark's level of competence with MS Access and with databases in general, but I still think getting some amount of help from an Access expert would lead to the most cost-effective solution, even if it doesn't involve turning the entire project over to them.

Herbert Sitz
Monday, May 19, 2003

He's already got the data imported. The problem is what to do with it now.

Pascal  made the point clear talking about XML There is no point in having a data transfer language until you have agreed on a common schema for organizing the data. And how you organize that data depends on what the data is and how it is going to be used.

And this is a question of domain knowledge and general database design. What I, and I thiink Albert, are suggesting is that he needs to design his database from scratch, and completely forget about the ordering he had from Excel. Only when he has the design right should he start thinking about getting the data in the right place.

Stephen Jones
Monday, May 19, 2003

Ummm that's my point.  Its not broken because the data is the way it is but because the data hasn't been modelled in the first place.  Data modelling happens to be one of my things, and doing it with people that understand their own knowledge domain.

I'm not at all unique in this.  Its not an Access problem per se but as Albert said in the first place a modelling one.

Simon Lucy
Monday, May 19, 2003

Oh, sorry, didn't realize the data was already imported. 

If so, then I'd suggest reimporting and having the description field automatically separated into separate fields by the import process.  That's easier than doing it programmatically.  From the look of it, the field can just be separated as a set of 'comma separated value' in the import.

After they're split out, it's just a matter of normalizing, as you say. 

Herbert Sitz
Monday, May 19, 2003

But I still think the best solution is for Mark to work in tandem with an Access expert.  Mark can be the domain expert while the Access guy does the data modelling and massages the data into whatever data model he comes up with.

Herbert Sitz
Monday, May 19, 2003

As Mark said, they are actually trying to hire a db guy, but he's holiding the fort until the guy arrives.

As we haven't got the data in front of us I suppose we will have to wait until Mark decides whether the modelling is going to be easy or not.

There is a normalization tool in Access that will split tables, but I reckon that to use it properly you have to know enough not to need it in the first place.

Stephen Jones
Tuesday, May 20, 2003

Stephen -- Sorry, I missed that. 

Well, given that Mark also says that he's working for peanuts (i.e., it's cost effective for his employer) and that he's mostly after the experience (i.e., he doesn't care about the money or extra work), I'd say go for it and plug away.

Albert and Dave B's posts give useful advice, and the MS whitepaper links in Albert's post should be helpful.

But what Simon says is still very true:  the suggestions work "only if you know what they mean".  This is a really interesting point.  Whenever you're learning anything new, you can read explanations and think you understand it.  And to some extent you do.  But as you progress you find that you misunderstood some key point in the initial explanation, didn't understand everything fully, and/or didn't understand the depth and significance of everything in it.    This is natural.  After some initial explanation, you "know just enough to be dangerous", as they say.  A deeper understanding will come only after you've worked through the problems on your own.

As for advice for Mark, I'd say to make sure he looks at Albert's MS links and uses some good database and MS Access reference texts.  You need a good Access reference because once you learn what you want to do regarding general database design and use, you need to know how to make Access do what you want. 

Then go to a better site than JoS for specific help with the database.  The MS newsgroups on Access are okay, but there are a lot of questions that go unanswered and I never got a very good community feel out of them.  A much better community, in my opinion, can be found at http://www.tek-tips.com.  It's an active site with help for lots of different software and the forums on MS Access are among their most active.  Mark's questions would be appropriate in the "MS Access - Other" forum.  Hopefully he could get started and find someone to "take him under their wing there." 

Nothing against JoS, but it's not generally the best site for getting specific help with software questions.

Herbert Sitz
Tuesday, May 20, 2003

For specific Access tips I find Woodys Office portal
http://www.wopr.com good.

The Access news group on http://peach.ease.lsoft.com/archives/access-l.html
provides useful tips, thougn they tend to be about the VBA rather than the design. Unfortunately there are Acess developers around who think that a relational database is some kind of computer dating agency!

As you've stated before what he has to do is to get the schema right. Once he has done that, then Access allows you to do other things quickly and easily. However the process of doing the original design is independent of the database engine you will be using. In this respect Access requires as much skill as Oracle or DB2.

Stephen Jones
Tuesday, May 20, 2003

Lol, I didn't know my question would lead to an intellectual discourse :)
I've managed to figure out the problem, and had someone guide me through the process. The db is is almost finished.
I tell you, there seems to be a big gap between the academic arena and the real world. Something the former will have to address soon.
Thankyou to those who gave constructive comments. I've learned more then just the answer I sought.

Mark
Wednesday, May 21, 2003

*  Recent Topics

*  Fog Creek Home