"How did you work that out so quickly?" my friend asked earlier this week when I sent him a handful of additional ways to view the problem we'd been pairing on.
"Stack Overflow and lots of small, fast, cheap experiments," I replied.
And that's often true. It might not be Stack Overflow, but it's usually breaking a problem down, iterating on coherent chunks of it, composing the pieces into larger chunks, and building back to the original problem.
The approach is reliable for me, but asking if it will be quick is like asking how long the proverbial string is. Sometimes it takes longer and I need to iterate on the approach itself, finding other "seams" along which to break the problem apart into units that can be addressed independently.
Sometimes I just give up and wait for the next opportunity to learn. Whether this is the right thing depends on all sorts of factors including whether I need an answer, whether I need the answer, whether I need an answer now, what else I could do instead, whether I can think of alternative things to try, how well I feel I understand the problem and the space I'm working in, whether I think I could ask for the answer, whether I've already learned something, and so on.
--00--
In case you're wondering, the problem was to extract a particular field from a reasonable number of records in a JSON file and check that all instances were eight characters long.
He'd been trying for a while but was stuck, so we paired for ten minutes. As this was a one-off task and the data wasn't exceptionally large, we arrived at a cheap answer using jq and simply eyeballing the output:
$ jq -r '.structure[].field' data.json
abcdefgh
12345678
ABCDEFGH
...
He'd been trying to build a more sophisticated approach using jq to count the number of characters in the field, but had got bogged down in documentation. As he needed to move on and we'd got a good enough approach, I dropped off the call.
But, for my own learning, I took a few minutes looking for a more sophisticated approach. To begin with I made a small test data file that mimicked the important elements of his file. Self-describing data helps me to interpret the output of my tests:
{
  "types": [
    {
      "type": "20.................."
    },
    {
      "type": "5...."
    },
    {
      "type": "9........"
    }
  ]
}
I thought I could probably do something quick and dirty at the command line to reduce the risk of missing something in the output data. I searched a little, tried a couple of things using awk, and arrived at this:
$ jq -r '.types[].type' file.json | awk '{print length(), $0}' | sort -n
5 5....
9 9........
20 20..................
Which is fine but, inspired by my friend, I wondered whether I could do the thing in jq alone.
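As an aside on the awk version: if eyeballing a sorted list still feels risky, awk can do the comparison itself and print only the offenders. This is a minimal sketch rather than what we actually ran; the function name check_len and the sample values are made up for illustration, and in practice the input would come from jq -r as above.

```shell
# Hypothetical helper: print every input line whose length differs from the
# expected length (first argument), and exit non-zero if any were found.
check_len() {
  awk -v want="$1" '
    BEGIN { bad = 0 }
    length($0) != want { print length($0), $0; bad = 1 }
    END { exit bad }
  '
}

# Made-up sample values standing in for the output of jq -r.
printf '%s\n' abcdefgh 1234567 ABCDEFGH | check_len 8 \
  || echo "found entries that are not 8 characters long"
# prints:
# 7 1234567
# found entries that are not 8 characters long
```

The non-zero exit status means the same check could be dropped into a script without anyone needing to read the output.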
I searched for "jq find length of string" and came up with a Stack Overflow answer which gives a clue that there is a length function that could be helpful.
Unfortunately, the answer is not quite what he was after. Simply bodging it onto the end of the call to jq fails and, because it uses commands and syntax I don't know, I couldn't quickly see why.
The jq documentation confirmed that the length function existed in the version of jq that I was using and gave simpler examples. I find technical doc, particularly abstract syntax, really hard to get my head around even after years of trying. I think that jq does a great job of giving the syntax, some description, and worked examples.
Despite the quality of the doc, in my experience searching more generally first helps me to find things that I would not even know to search for, and then experimenting on a toy example helps to cement my understanding. It's like working top-down and bottom-up at the same time, repeatedly comparing the map and the territory.
So what I did next was try variant uses of the command on the end of my original jq call until one of them, the simplest in this case, looked promising:
$ jq '.types[].type | length ' file.json
20
5
9
That seemed reasonable and so I tried to build it back up to something closer to the original problem:
$ jq '.types[].type | select (length == 5 ) ' file.json
"5...."
Bingo! Now I wanted to reverse the logic because the original task was to check that there were no entries that were not eight characters long. That was a straightforward change, although I noticed that my test data didn't include an eight-character entry, so I checked against nine instead:
$ jq '.types[].type | select (length != 9 ) ' file.json
"5...."
"20.................."
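Two jq-only follow-ups seem like natural next experiments from here. The sketch below is my own extension, not something from the pairing session: the helper names lengths_sorted and all_have_length are hypothetical, and it assumes a jq recent enough (1.5 or later) to have sort_by, all with a generator argument, --argjson, and the -e exit-status flag. The first helper reproduces the awk-and-sort listing inside jq; the second turns the check into a pass/fail exit status.

```shell
# Hypothetical helper: list each .types[].type with its length, shortest first.
lengths_sorted() {
  jq -r '[.types[].type] | sort_by(length)[] | "\(length) \(.)"' "$@"
}

# Hypothetical helper: succeed only if every .types[].type has length $1.
# all(generator; condition) is true when every generated value passes, and
# -e makes jq exit 0 for true and 1 for false or null.
all_have_length() {
  jq -e --argjson n "$1" 'all(.types[].type; length == $n)' > /dev/null
}

# Toy data in the same shape as the test file, fed on standard input.
json='{"types":[{"type":"9........"},{"type":"5...."}]}'

printf '%s' "$json" | lengths_sorted
# prints:
# 5 5....
# 9 9........

if printf '%s' "$json" | all_have_length 9; then
  echo "every entry is 9 characters long"
else
  echo "at least one entry is not 9 characters long"
fi
# prints: at least one entry is not 9 characters long
```

One thing to watch with all: it is also true when there are no values at all, so an empty types array would pass the check.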