boost for
(cat|girl|gay|boob|shark|emoj(i|o)|gargr?(on|amel))s?
hello i'm one of the seven (7) people in the world who apparently enjoy writing regular expressions enough to shitpost about it
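For anyone who wants to check what the boost pattern above accepts, here is a minimal sketch using Python's standard `re` module. The helper name `boostable` is made up for illustration, and using `fullmatch` (whole-word matching) is an assumption about how the pattern is meant to be applied.

```python
import re

# The pattern from the post, compiled unchanged.
BOOST = re.compile(r"(cat|girl|gay|boob|shark|emoj(i|o)|gargr?(on|amel))s?")

def boostable(word: str) -> bool:
    """Return True if the whole word matches the boost pattern (hypothetical helper)."""
    return BOOST.fullmatch(word) is not None

for word in ["cats", "emojos", "gargron", "gargamel", "dog"]:
    print(word, boostable(word))
```

Note that `gargr?(on|amel)` also lets through "gargon" and "gargramel", since the `r` is optional in front of either ending.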
@CobaltVelvet
I enjoy writing regex engines, does that count?
@kellerfuchs @CobaltVelvet I enjoy complaining about how much better regular expressions could be if they were composable, i.e. http://synthcode.com/scheme/irregex/ (see also http://www.more-magic.net/posts/lispy-dsl-sre.html, and for why regexes aren't composable (read: "stringly typed systems") see http://groups.csail.mit.edu/mac/users/gjs/6.945/psets/ps01/ )
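To make the composability complaint concrete, here is a rough Python sketch of building a pattern from structured pieces instead of pasting strings together. This is not the irregex or SRE API: the names `Lit`, `Seq`, `Alt`, `Opt`, `seq`, `alt`, `opt`, and `render` are all invented for illustration, and the sketch only lowers the structure to an ordinary `re` pattern at the very end.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:
    text: str        # matched literally, never parsed as regex syntax

@dataclass(frozen=True)
class Seq:
    parts: tuple     # sub-patterns matched one after another

@dataclass(frozen=True)
class Alt:
    options: tuple   # any one of the sub-patterns

@dataclass(frozen=True)
class Opt:
    inner: object    # zero or one occurrence of the sub-pattern

def lift(x):
    """Treat bare strings as literals so call sites stay short."""
    return Lit(x) if isinstance(x, str) else x

def seq(*parts):
    return Seq(tuple(lift(p) for p in parts))

def alt(*options):
    return Alt(tuple(lift(o) for o in options))

def opt(inner):
    return Opt(lift(inner))

def render(node) -> str:
    """Lower the tree to `re` syntax, escaping and grouping so pieces cannot interfere."""
    if isinstance(node, Lit):
        return re.escape(node.text)
    if isinstance(node, Seq):
        return "".join(render(p) for p in node.parts)
    if isinstance(node, Alt):
        return "(?:" + "|".join(render(o) for o in node.options) + ")"
    if isinstance(node, Opt):
        return "(?:" + render(node.inner) + ")?"
    raise TypeError(f"not a pattern node: {node!r}")

# The boost pattern, assembled from reusable named pieces.
garg = seq("garg", opt("r"), alt("on", "amel"))
emoj = seq("emoj", alt("i", "o"))
boost = seq(alt("cat", "girl", "gay", "boob", "shark", emoj, garg), opt("s"))
pattern = re.compile(render(boost))
```

Because every piece is a value rather than a string fragment, `garg` can be reused inside other patterns without worrying about missing parentheses or unescaped metacharacters, which is the failure mode of the "stringly typed" approach.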
@cwebber @kellerfuchs @CobaltVelvet Shout-out to probably my favorite published paper ever, "A Play on Regular Expressions": https://sebfisch.github.io/haskell-regexp/
I used that library once to crack the cipher state of badly-encrypted files where the plaintext had a nice regular structure. Each observed byte constrained the possible cipher states a little more, in ways that fit the semiring construction nicely.
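The semiring idea mentioned here can be sketched in a few lines of Python. This is a naive, specification-style matcher that tries every split of the input, not the efficient algorithm from the paper and not the Haskell library's API; the tuple-based regex encoding and the names `Semiring`, `BOOL`, `COUNT`, and `weight` are assumptions made for this example. Swapping the semiring changes what a "match" computes: booleans answer yes or no, natural numbers count the distinct ways the input can match.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    zero: Any            # weight of "no match"
    one: Any             # weight of a trivial match
    add: Callable        # combines alternatives
    mul: Callable        # combines consecutive pieces

BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
COUNT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

# Regexes as tuples: ("eps",), ("sym", ch), ("alt", p, q), ("seq", p, q), ("rep", p)

def weight(sr, rx, s):
    """Weight of matching the whole string s against rx in semiring sr."""
    tag = rx[0]
    if tag == "eps":
        return sr.one if s == "" else sr.zero
    if tag == "sym":
        return sr.one if s == rx[1] else sr.zero
    if tag == "alt":
        return sr.add(weight(sr, rx[1], s), weight(sr, rx[2], s))
    if tag == "seq":
        total = sr.zero
        for i in range(len(s) + 1):          # every way to split s in two
            left = weight(sr, rx[1], s[:i])
            right = weight(sr, rx[2], s[i:])
            total = sr.add(total, sr.mul(left, right))
        return total
    if tag == "rep":
        if s == "":
            return sr.one                    # zero repetitions
        total = sr.zero
        for i in range(1, len(s) + 1):       # non-empty first chunk, then the rest
            first = weight(sr, rx[1], s[:i])
            rest = weight(sr, rx, s[i:])
            total = sr.add(total, sr.mul(first, rest))
        return total
    raise ValueError(f"unknown regex node: {tag}")

a = ("sym", "a")
rx = ("rep", ("alt", a, ("seq", a, a)))      # (a|aa)*

print(weight(BOOL, rx, "aaa"))   # does "aaa" match at all?
print(weight(COUNT, rx, "aaa"))  # in how many distinct ways?
```

For `(a|aa)*` on "aaa" the counting semiring gives 3 (a·a·a, a·aa, aa·a), while the boolean semiring just reports that a match exists.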
@jamey @kellerfuchs @CobaltVelvet holy shit this post has everything
@jamey @CobaltVelvet
Yeah, it's a very nice paper.
IIRC, though, it's pretty hard to implement match groups with that approach (and it seems the Haskell library doesn't support them?), so that's likely a bust for @cwebber
@kellerfuchs @CobaltVelvet @cwebber Yes, you're correct, that library doesn't make it easy to extract parts of the match. I seem to recall that I thought through how to do it, but didn't actually try it because it's hard, and now I don't even remember how it would work.
But there's a paper from the next year's ICFP that extends regex derivatives to context-free parser combinators, so cleverer people than me have that covered!
@jamey @CobaltVelvet @cwebber
IIRC, they already had context-free stuff there, using coinductive (i.e., potentially infinite) structures for regexps.
@kellerfuchs @CobaltVelvet @cwebber Yes, that's correct, but the later paper makes it relatively efficient and, of immediate relevance to this discussion, extends it to parser combinators which can extract a parse tree rather than just recognizing whether an input matches or not.
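For readers who haven't met regex derivatives, here is a minimal Python sketch of the basic recognizer. It is only the plain yes/no matcher, not the parser-combinator extension discussed above, so it cannot extract a parse tree; it also does no simplification, so the derived expression grows with the input. The tuple encoding and the names `nullable`, `derive`, and `matches` are assumptions for this example.

```python
# Regexes as tuples:
#   ("empty",) matches nothing, ("eps",) matches only "",
#   ("sym", ch), ("alt", p, q), ("seq", p, q), ("rep", p)
EMPTY = ("empty",)
EPS = ("eps",)

def nullable(rx) -> bool:
    """Does rx accept the empty string?"""
    tag = rx[0]
    if tag in ("eps", "rep"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        return nullable(rx[1]) or nullable(rx[2])
    if tag == "seq":
        return nullable(rx[1]) and nullable(rx[2])
    raise ValueError(f"unknown regex node: {tag}")

def derive(rx, c):
    """Regex for what may follow after consuming character c."""
    tag = rx[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if rx[1] == c else EMPTY
    if tag == "alt":
        return ("alt", derive(rx[1], c), derive(rx[2], c))
    if tag == "seq":
        first = ("seq", derive(rx[1], c), rx[2])
        if nullable(rx[1]):
            # c may also be consumed by the second part if the first can be empty
            return ("alt", first, derive(rx[2], c))
        return first
    if tag == "rep":
        return ("seq", derive(rx[1], c), rx)
    raise ValueError(f"unknown regex node: {tag}")

def matches(rx, s: str) -> bool:
    """Take one derivative per character, then check whether the remainder accepts ""."""
    for c in s:
        rx = derive(rx, c)
    return nullable(rx)

ab_star = ("seq", ("sym", "a"), ("rep", ("sym", "b")))   # ab*
print(matches(ab_star, "abbb"), matches(ab_star, "ba"))
```

Getting from this recognizer to something that returns captured groups or a parse tree is exactly the hard part the thread is pointing at.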
@jamey @CobaltVelvet @cwebber
*nods.*
If you happen to have a ref. to the paper, that would be great; otherwise, I'll try and remember to hunt it down.
@jamey @kellerfuchs @CobaltVelvet @cwebber I think you actually want https://arxiv.org/pdf/1604.04695.pdf
@cwebber @CobaltVelvet @kellerfuchs @jamey at least, that's the one that gives the good runtime