Fog Creek Software
Discussion Board




removing delimiters like ./ in Java, but not in between words

I know there is a StringTokenizer class for removing all
delimiters, but is there a way to remove only the delimiters that are isolated from words, and not the ones in between words?
For example, for

www.abc.com ... , me  .and ..

it would give

www.abc.com me .and

Sergei
Tuesday, July 22, 2003

Typically you would end up using nested tokenizers, or using a simple whitespace delimiter to isolate character blocks before analysing individual tokens to see if they are made exclusively from your second set of delimiters.
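
A minimal sketch of that second approach: split on whitespace, then drop any block made up only of delimiter characters. The delimiter set ".,/" and the class and method names are assumptions for illustration, not something given in the question.

```java
import java.util.StringTokenizer;

class IsolatedDelimiterFilter {
    // Assumed delimiter set; adjust to whatever counts as a delimiter for you.
    private static final String DELIMS = ".,/";

    // True if every character of the token is one of the delimiters.
    static boolean isAllDelimiters(String token) {
        for (int i = 0; i < token.length(); i++) {
            if (DELIMS.indexOf(token.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    // Splits on whitespace and keeps only blocks that contain a non-delimiter.
    static String strip(String input) {
        StringTokenizer st = new StringTokenizer(input); // whitespace by default
        StringBuilder out = new StringBuilder();
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (isAllDelimiters(token)) {
                continue; // an isolated run such as "..." or ","
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: www.abc.com me .and
        System.out.println(strip("www.abc.com ... , me  .and .."));
    }
}
```

Delimiters attached to a word, as in "www.abc.com" or ".and", are left alone because the whole whitespace-separated block is tested, not each character.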

Richard
Tuesday, July 22, 2003

One word: Regular expressions.

Ok, "regexp" to make it just one word :).
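
A rough sketch of what that might look like, assuming the delimiters are ".", "," and "/" and that "isolated" means a run of delimiters with whitespace (or the start or end of the string) on both sides. The class and method names are made up for illustration, and String.replaceAll needs JDK 1.4 or later.

```java
class RegexDelimiterFilter {
    // A run of delimiters that is neither preceded nor followed by a
    // non-whitespace character, i.e. it stands alone between spaces.
    private static final String ISOLATED = "(?<!\\S)[.,/]+(?!\\S)";

    static String strip(String input) {
        return input
                .replaceAll(ISOLATED, "") // drop the isolated runs
                .trim()                   // remove leftover edge whitespace
                .replaceAll("\\s+", " "); // collapse the gaps they leave
    }

    public static void main(String[] args) {
        // prints: www.abc.com me .and
        System.out.println(strip("www.abc.com ... , me  .and .."));
    }
}
```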

T. Norman
Tuesday, July 22, 2003

Be careful: the Java API's StringTokenizer will not return 'empty' tokens. For example, this

    abc|qwerty|123

will return three tokens 'abc', 'qwerty' and '123', and this

    abc||123

will return two tokens, 'abc' and '123'. This may cause application errors in some specific cases, for example if your client application talks to a server using your own protocol, which has a fixed number or sequence of tokens...
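
A small sketch of the difference, comparing StringTokenizer with String.split (available from JDK 1.4), which does keep the empty field. The class and method names are only for illustration.

```java
import java.util.StringTokenizer;

class EmptyTokenDemo {
    // Number of tokens StringTokenizer reports for a '|'-separated line.
    static int tokenizerCount(String line) {
        return new StringTokenizer(line, "|").countTokens();
    }

    // Number of fields String.split reports; the -1 limit also keeps
    // trailing empty fields, which split would otherwise drop.
    static int splitCount(String line) {
        return line.split("\\|", -1).length;
    }

    public static void main(String[] args) {
        System.out.println(tokenizerCount("abc||123")); // prints 2
        System.out.println(splitCount("abc||123"));     // prints 3
    }
}
```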

Evgeny /Javadesk/
Tuesday, July 22, 2003

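Hmm

    // requires: import java.util.StringTokenizer;
    // Passing true as the third argument returns the delimiters as tokens too.
    StringTokenizer st = new StringTokenizer("abc||123", "|", true);
    while (st.hasMoreTokens()) {
        System.out.println("Token - " + st.nextToken());
    }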

gives

Token - abc
Token - |
Token - |
Token - 123


So you can get back the delimiters with a StringTokenizer. The StreamTokenizer doesn't make a distinction between delimiters and other sorts of tokens; it just tokenizes, and it's up to you to determine what each token means. This could form the basis of a solution.
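
As one possible way to build on the returned delimiters, here is a sketch that reconstructs the empty fields a fixed-format protocol would expect. The class and method names are hypothetical, and it assumes single-character delimiters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class FieldSplitter {
    // Returns every field of the line, including empty ones, by watching
    // for two delimiters in a row (or a delimiter at either end).
    static List<String> fields(String line, String delims) {
        List<String> result = new ArrayList<String>();
        StringTokenizer st = new StringTokenizer(line, delims, true);
        boolean expectField = true; // true right after a delimiter or at the start
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            boolean isDelim = token.length() == 1
                    && delims.indexOf(token.charAt(0)) >= 0;
            if (isDelim) {
                if (expectField) {
                    result.add(""); // nothing between this delimiter and the last
                }
                expectField = true;
            } else {
                result.add(token);
                expectField = false;
            }
        }
        if (expectField) {
            result.add(""); // trailing delimiter or empty input
        }
        return result;
    }

    public static void main(String[] args) {
        // prints: [abc, , 123]
        System.out.println(fields("abc||123", "|"));
    }
}
```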

For a really complete solution, JavaCC (http://www.experimentalstuff.com/Technologies/JavaCC/index.html) makes it really easy to build parsers. If you're going to end up with a lot of nested tokenizers, you might look at this instead. It's very easy to use.

Alex Moffat
Tuesday, July 22, 2003

If a "replace all substrings in a string with another string" approach will do the trick for you, here is a (non-optimized) sample.

http://javaalmanac.com/egs/java.lang/ReplaceString.html

This is more suited to JDK 1.3 and earlier. If you are using JDK 1.4, it is better to use regular expressions (java.util.regex, which comes with the JDK).

In 1.4, String comes with a replaceAll() method.
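
For reference, a minimal sketch of the kind of pre-1.4 literal replacement loop meant here. It is not the code from the linked page, and the class and method names are invented for illustration. Note that replaceAll() treats its first argument as a regular expression, so characters like "." need escaping there, whereas this loop matches the target literally.

```java
class ReplaceSketch {
    // Replaces every literal occurrence of target in text, using only
    // indexOf and StringBuffer so it also works before JDK 1.4.
    static String replace(String text, String target, String replacement) {
        if (target.length() == 0) {
            return text; // nothing sensible to search for
        }
        StringBuffer out = new StringBuffer();
        int from = 0;
        int hit;
        while ((hit = text.indexOf(target, from)) >= 0) {
            out.append(text.substring(from, hit)).append(replacement);
            from = hit + target.length();
        }
        out.append(text.substring(from)); // the tail after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a b c
        System.out.println(replace("a ... b ... c", " ...", ""));
    }
}
```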

A.F.

Avrom Finkelstein
Tuesday, July 22, 2003

I think regular expressions may add overhead, especially if they are evaluated very often or concurrently.
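
One common way to limit that cost, sketched here under the assumption that the same expression is reused many times, is to compile the Pattern once and share it. Pattern instances are immutable and safe to share between threads, while each call creates its own Matcher. The delimiter set and the names below are illustrative only.

```java
import java.util.regex.Pattern;

class CompiledDelimiterFilter {
    // Compiled once and reused; assumed delimiter set is ".", "," and "/".
    private static final Pattern ISOLATED =
            Pattern.compile("(?<!\\S)[.,/]+(?!\\S)");
    private static final Pattern SPACES = Pattern.compile("\\s+");

    static String strip(String input) {
        // A fresh Matcher per call, since Matcher itself is not thread-safe.
        String withoutRuns = ISOLATED.matcher(input).replaceAll("");
        return SPACES.matcher(withoutRuns).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // prints: www.abc.com me .and
        System.out.println(strip("www.abc.com ... , me  .and .."));
    }
}
```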

Evgeny /Javadesk/
Wednesday, July 23, 2003
