I wrote a strrstr()

Just because.

https://stackoverflow.com/a/78527348/2706707

If I made a major fail I’m sure the nice folks at SO will let me know, so, feel free to roast if I did a stoopid — my feelings will already be hurt. 🙃
I was tasked a few years back to write one of these that found near matches. Guy was just off the cuff throwing out values ... like 90% match, etc ... goal was to find things like person's name if it were spelled wrong in a system, after which other logic would match other things to see if same entity or not, all I had to do was the simple first cut.

ended up doing a variety of things and one that worked astonishingly well for real english names and words was to sort the words and deduplicate them, so that 'bubba' is 'abu' and it would match bub, buba, and well, other stuff that was just wrong like abu :). I think we allowed 1 + 10% of the letter count to be missing, eg bubbax would match because its checking abu vs abux and only 1 letter off. Not general purpose, of course, but that was the first test of 5 or 6. If it failed that, the rest of the tests were skipped. It wasn't perfect but if you were trying to find someone and couldn't spell their name, it usually found your guy when we were done with it. I dunno what you could try that is both general purpose and faster... but if you get really, really bored try doing fuzzy matching...

I didn't see anything wrong with what you did. More code than I expected, but it makes sense to run it various ways.
Last edited on
My c is a bit rusty but consider which doesn't use any c string function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#include <stdio.h>

// Reverse string find. Finds fnd within st starting from the end of st
// Returns a pointer within st of the last occurrence of fnd or NULL if not found
const char* strrstr(const char* st, const char* fnd) {
	const char* stbeg = st;
	const char* fndbeg = fnd;

	for (; *st; ++st);
	for (; *fnd; ++fnd);

	const size_t fndlen = fnd - fndbeg;

	if (st - stbeg < fndlen)
		return NULL;

	st -= fndlen;

	for (int notgot; notgot = 0, st >= stbeg; st -= notgot) {
		for (const char *itr = st, *fit = fndbeg; fit != fnd; ++fit, ++itr)
			if (notgot = (*itr != *fit))
				break;

		if (!notgot)
			return st;
	}

	return NULL;
}

int main() {
	const char* tst = "thist is a test string that tests thisy";
	const char* fnd1 = strrstr(tst, "test");
	const char* fnd2 = strrstr(tst, "thisy");
	const char* fnd3 = strrstr(tst, "thist");
	const char* fnd4 = strrstr(tst, "thiss");

	printf("%lli\n", fnd1 ? fnd1 - tst : -1);
	printf("%lli\n", fnd2 ? fnd2 - tst : -1);
	printf("%lli\n", fnd3 ? fnd3 - tst : -1);
	printf("%lli\n", fnd4 ? fnd4 - tst : -1);
}


1
2
3
4
28
34
0
-1

Last edited on
@seeplus
Alas, your function doesn’t pass the litmus tests.

haystack, needle, (result - haystack) or -1 for NULL
  { "", "", 0 },
  { "", "x", -1 },
  { "x", "", 0 }, // your function returned offset = 1
  { "x", "x", 0 },
  { "xy", "x", 0 },
  { "xy", "y", 1 },
  { "xyx", "x", 2 },
  { "xyx", "y", 1 },
  { "xyx", "z", -1 },
  { "xyx", "", 0 }, // your function returned offset = 3 

If you fix it I’ll test it.


@jonnin
My code is only about 30 lines (including commentary), plus another 25 for the header. See the code-blocks at the bottom of the post.

The stuff in the first block of code is a test harness (written by another poster and modified by me) which includes code supplied by all the other people who tried to write a strrstr. That will naturally be heavy.


The closest I ever got to writing non-exact matching was playing with a soundex algorithm I wrote for someone (I don't even remember why I had to do this, or even if it was something I got paid for or just volunteered, lol).


Thanks for looking!
I enjoy messing with problems like this one.
Last edited on
doesn’t pass the litmus tests


Ah. The issue is with an empty needle. Yep, didn't test for that. It's always the edge cases that bite you!

What's the empty needle rule - as from the above the result is always the beginning of the haystack?

Last edited on
Assuming that an empty needle returns the address of the beginning of the haystack then:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
// Reverse string find. Finds fnd within st starting from the end of st
// Returns a pointer within st of the last occurrence of fnd or NULL if not found
const char* strrstr(const char* st, const char* fnd) {
	if (!*fnd)
		return st;

	const char* stbeg = st;
	const char* fndbeg = fnd;

	for (; *st; ++st);
	for (; *fnd; ++fnd);

	const size_t fndlen = fnd - fndbeg;

	if (st - stbeg < fndlen)
		return NULL;

	st -= fndlen;

	for (int notgot; notgot = 0, st >= stbeg; st -= notgot) {
		for (const char *itr = st, *fit = fndbeg; fit != fnd; ++fit, ++itr)
			if (notgot = (*itr != *fit))
				break;

		if (!notgot)
			return st;
	}

	return NULL;
}

int main() {
	const char* tst = "thist is a test string that tests thisy";
	const char* fnd1 = strrstr(tst, "test");
	const char* fnd2 = strrstr(tst, "thisy");
	const char* fnd3 = strrstr(tst, "thist");
	const char* fnd4 = strrstr(tst, "thiss");
	const char* fnd5 = strrstr(tst, "");

	printf("%lli\n", fnd1 ? fnd1 - tst : -1);
	printf("%lli\n", fnd2 ? fnd2 - tst : -1);
	printf("%lli\n", fnd3 ? fnd3 - tst : -1);
	printf("%lli\n", fnd4 ? fnd4 - tst : -1);
	printf("%lli\n", fnd5 ? fnd5 - tst : -1);
}


1
2
3
4
5
28
34
0
-1
0

Last edited on
There's also this trivial implementation that uses strstr:

1
2
3
4
5
6
7
8
9
const char* strrstr(const char* st, const char* fnd) {
	if (!*fnd)
		return st;

	const char* prev = NULL;

	for (const char* ret; (ret = strstr(st, fnd)) != NULL; prev = ret, st = ret + 1);
	return prev;
}


But this should be slower as it always traverses the whole haystack going forward whereas the other way traverses backwards only until a match is found.
Last edited on
I agree that returning the first character in the string for an empty search string is stupid. I actually had to mend my own algorithm to do that. But.... it is what everyone else apparently believes to be the correct behavior, lol. (But more than that, that is what the test harness expected, and I wasn’t going to modify everyone else’s algorithms to “fix” it.)

Sadly, your algorithm rates poorly.
averages:
#1 0.812 us last_strstr
#2 0.825 us strrstr
#3 1.404 us theo
#4 1.417 us cordelia
#5 2.176 us backwards_memcmp
#6 2.581 us seeplus
#7 2.946 us digitalross
#8 723.103 us sinan


The trivial version using strstr() was already considered (it is called “last_strstr()” in the test suite). It regularly rates first or second in the tests.

These days if anyone were to ask me how to “strrstr“... my answer would be to just use “last strstr”.

I do have one more idea that might be interesting, but it isn’t O(1). EDIT — it was a horrible idea.

Latest test. This is more like I expected.
averages:
#1 0.770 us strrstr
#2 0.813 us last_strstr
#3 1.481 us theo
#4 1.539 us cordelia
#5 2.116 us seeplus
#6 2.121 us backwards_memcmp
#7 2.674 us stacked
#8 2.945 us digitalross
#9 714.874 us sinan
Last edited on
Sadly, your algorithm rates poorly.


OK. I've changed the for loops. This may (hopefully) improve the performance.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
const char* strrstr(const char* st, const char* fnd) {
	if (!*fnd)
		return st;

	const char* stbeg = st;
	const char* fndbeg = fnd;

	for (; *st; ++st);
	for (; *fnd; ++fnd);

	const size_t fndlen = fnd - fndbeg;

	if (st - stbeg < fndlen)
		return NULL;

	st -= fndlen;

	for (int notgot = 1; st >= stbeg; --st) {
		for (const char* itr = st, *fit = fndbeg; fit != fnd && !(notgot = (*itr++ != *fit++)); );

		if (!notgot)
			return st;
	}

	return NULL;
}

hmm. there is extra logic because of fitting things, but it could be faster to do 64 bit int comparisons (8 times faster!) if you can spare the extra stuff to make that work, instead of 1 byte at a time. It would likely run slower for "is x in "oxo" but it would tear it up on large input.
Alas, no significant improvement.
averages:
#1 0.766 us strrstr
#2 0.817 us last_strstr
#3 1.520 us theo
#4 1.554 us cordelia
#5 2.087 us seeplus
#6 2.111 us backwards_memcmp
#7 2.678 us stacked
#8 2.942 us digitalross
#9 717.930 us sinan

You know, the entire test program is there in my posting on SO. You can download it and add your own version and compile and play all you like. 🙂

@jonnin
For short strings (< a few hundred bytes minimum) it isn’t worth trying anything fancier than a `strstr()`. For longer strings, yeah, that could be an interesting idea. For a four-byte integer you’d only need four compares per aligned integer, plus care taken at the unaligned ends. This is basically how (good) implementations of things like `strstr()` work underneath, though.
Last edited on
Alas, no significant improvement


Probably because the algorithm works backwards through the haystack but the timing issue is first finding the end of the haystack which necessitates iterating through the haystack to find its end (same with the needle).

It's an interesting c programming exercise.

the entire test program is there in my posting on SO


Missed it as I just skimmed SO. :)


PS.
OK. Got the test program. Thanks. Can now play!
Last edited on
I've hacked my trivial implementation from above (now seeplus1). My backwards implementation is now seeplus2. seeplus1 has now the fastest average and takes pole position for averages! (MS VS2022)

1
2
3
4
5
6
7
8
9
10
averages (64 bit):
#1 0.624 us seeplus1
#2 0.708 us last_strstr
#3 1.984 us strrstr
#4 5.569 us cordelia
#5 6.449 us theo
#6 8.546 us digitalross
#7 8.824 us seeplus2
#8 9.601 us backwards_memcmp
#9 625.747 us sinan 


1
2
3
4
5
6
7
8
9
10
char* seeplus1(const char* st, const char* fnd) {
    if (!*fnd)
        return (char*)st;

    const char* prev = NULL;

    for (const char* ret = NULL, *fnd1 = fnd + 1; ((ret = strchr(st, *fnd))) && (*fnd1 && ((ret = strstr(ret + 1, fnd1))) || !*fnd1); st = (prev = ret) + 1);

    return (char*)prev;
}


PS Updated so that strstr starts from the second char (if present) as first char is checked by strchr
Last edited on
For info. The timings for the top 3 for each round are (64 bit):

1
2
3
4
5
6
7
8
9
round 1 seeplus1(0.356)    last_strstr(0.613) strrstr(3.560)
round 2 last_strstr(0.670) seeplus1(0.683)    strrstr(3.600)
round 3 last_strstr(0.794) seeplus1(0.809)    strrstr(3.620)
round 4 seeplus1(0.369)    last_strstr(0.648) strrstr(1.950)
round 5 last_strstr(0.667) seeplus1(0.679)    strrstr(1.972)
round 6 last_strstr(0.788) seeplus1(0.794)    strrstr(2.000)
round 7 strrstr(0.288)     seeplus1(0.341)    last_strstr(0.610)
round 8 strrstr(0.292)     last_strstr(0.666) seeplus1(0.673)
round 9 strrstr(0.289)     last_strstr(0.787) seeplus1(0.798)

Last edited on
It is very impressive!

I haven’t had a moment to mess with it, but as I review the work it makes me wonder. My `rstrstr()` works so quickly on the last because it is careful to skip impossibilities on both ends of the haystack. The `last_strstr()` works because `strstr()` really is about as linear as you can get.

I am unsure how using `strchr()` enhances the speed of the search, but it does, and handily. I think I have a new favorite!

Nice!

P.S. I am unsure how my `rstrstr()` is performing so poorly for you. I’m using Clang 14.0 on x86_64 with -O3.

P.P.S. If I have your permission, I will update the SO posting with your `seeplus1()` version and hook it up as the best solution as an alias with `strrstr()`.
Last edited on
I use MS VS2022 with optimisation for fast code.

Please use any of my posted code as you wish.
Last edited on
it could be faster to do 64 bit int comparisons (8 times faster!)


For 64 bit compile, the crt strstr() tries to do 16 byte comparisons at a time and also tries to use SSE/SSE2 instructions if available.

For i386 strstr() is coded as assembler.

That's why strstr() is so fast compared to other 'simple' versions that don't have these optimisations/nor assembler.

Note that the VS crt source code is at:
c:\Program Files (x86)\Microsoft Visual Studio xxx\VC\crt\src
where xxx is the VS version number
Last edited on
If compiled for 32 bit (as opposed to 64 bit above) then the average timings become:

1
2
3
4
5
6
7
8
9
10
averages (32 bit):
#1 0.650 us seeplus1
#2 0.699 us last_strstr
#3 2.209 us strrstr
#4 4.835 us cordelia
#5 4.915 us backwards_memcmp
#6 5.898 us theo
#7 8.633 us seeplus2
#8 11.048 us digitalross
#9 645.707 us sinan 


This is interesting especially backwards_memcmp where the timing reduces from 9.601 (64bit) to 4.915 (32 bit).
Last edited on
Topic archived. No new replies allowed.