Several years ago I built another blog to record my lessons learned in programming. I built this one to record my findings on computer security, but I’m still in the business of programming and still finding a lot to say about it. That other blog is way behind, and I want to consolidate, so I’ll be talking about lessons in programming here as well.
One of my ambitions with programming is to become a regex master. Text parsing is one of the most fun things about programming. You may find that strange, but I know I am not alone.
Anyway, I was working on my XML Formatter and was adding functionality to check for self-contained tags, such as <br />. I was getting very frustrated because my expression simply was not matching any self-contained tags, which were always getting the status of ‘text’ and getting their tag markings stripped out. (I have four classifications: open tags, closed tags, self-contained tags, and text.)
I ran the gamut of looking online and finding that regex is not a great way to build an XML parser. But I was not building this to parse XML. I was building it to format XML, because my inner designer hates seeing poorly formatted XML and HTML. Still, maybe the people were right and I was going to find myself in a forest dark where the straightforward path had been lost. I think I put off fixing it for a whole week.
Until this morning. After about an hour, I decided that something must be correct about my expression, so maybe the problem was further up.
Then I saw it.
breakdown = (re.findall(OPENSIG + '|' + CLOSESIG + '|' + 'CONTSIG' + '|' + TEXTSIG, testString))
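I had put CONTSIG in quotes. Making it a string literal. So that part of the pattern was looking for the literal text CONTSIG instead of using my self-contained tag expression, and it never matched a self-contained tag in my test string.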
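For anyone who wants to see the mistake in action, here is a minimal sketch. The four patterns are stand-ins I made up for illustration (my real expressions are not shown in this post), and the test string is just a small sample. The only difference between the two calls is the quotes around CONTSIG.

```python
import re

# Hypothetical patterns for illustration only; the real ones are not shown in the post.
OPENSIG = r"<[^<>/]+>"    # opening tag, e.g. <p>
CLOSESIG = r"</[^<>]+>"   # closing tag, e.g. </p>
CONTSIG = r"<[^<>/]+/>"   # self-contained tag, e.g. <br />
TEXTSIG = r"[^<>]+"       # text between tags

test_string = "<p>Hello<br />world</p>"

# Buggy: 'CONTSIG' is a string literal, so the regex looks for the letters C-O-N-T-S-I-G.
buggy = re.findall(OPENSIG + "|" + CLOSESIG + "|" + "CONTSIG" + "|" + TEXTSIG, test_string)

# Fixed: the variable is concatenated, so its pattern becomes part of the regex.
fixed = re.findall(OPENSIG + "|" + CLOSESIG + "|" + CONTSIG + "|" + TEXTSIG, test_string)

print(buggy)  # ['<p>', 'Hello', 'br /', 'world', '</p>']
print(fixed)  # ['<p>', 'Hello', '<br />', 'world', '</p>']
```

With these stand-in patterns, the buggy version shows the same symptom I was chasing: the inside of the tag falls through to the text pattern and loses its angle brackets.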
Well, here’s my sign.
I’m hoping today to finish up the last piece, allowing for file input. But here’s to nearing the very end of a project!