I'm toying with a simple function that should extract the parts of a string between brackets (for example, HTML tag names) into a nested array of strings:
str←'hel<o><worl>d'
DISPLAY parse_brackets str '<' '>'
┌→───────────┐
│ ┌→┐ ┌→───┐ │
│ │o│ │worl│ │
│ └─┘ └────┘ │
└∊───────────┘
My (naive) implementation of parse_brackets is the following:
R←parse_brackets(str br1 br2);open;close
⍝ APL2 compatibility to test with GNU APL
⎕ML←3
⍝ Indexes of the open bracket br1
open←(str=br1)/⍳⍴str
⍝ Indexes of the close bracket br2, minus 1,
⍝ so that close-open is the length of the text between the brackets
close←¯1+(str=br2)/⍳⍴str
⍝ Construct a matrix with the start indexes in the
⍝ first row and the lengths of the extracted words
⍝ in the second row;
⍝ split it into columns;
⍝ for each pair (start of the word; length),
⍝ drop up to the start and take the length
R←{⍵[2]↑⍵[1]↓str}¨⊂[1]open,[0.5]close-open
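To make the steps in the comments concrete, here is a sketch of the intermediate values for the example string. It assumes index origin 1 (⎕IO←1) and uses the literal brackets '<' and '>' in place of br1 and br2; the values in the comments were worked out by hand, not copied from a session.

```apl
str←'hel<o><worl>d'
open←(str='<')/⍳⍴str          ⍝ 4 7   positions of '<'
close←¯1+(str='>')/⍳⍴str      ⍝ 5 11  positions of '>' minus 1
close-open                     ⍝ 1 4   lengths of the bracketed words
open,[0.5]close-open           ⍝ 2-row matrix: 4 7 over 1 4
⊂[1]open,[0.5]close-open       ⍝ columns as pairs: (4 1)(7 4)
1↑4↓str                        ⍝ 'o'     first pair: drop 4, take 1
4↑7↓str                        ⍝ 'worl'  second pair: drop 7, take 4
```

The last two lines are what the inner function computes for each (start; length) pair.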
But I feel this implementation is rather clumsy, and there must be several more elegant ways to do it. I'd welcome any criticism and ideas on how to implement this in a more APLish way. (I feel I'm still programming in the 'functional' style here, i.e. transforming the data into a list and applying a lambda function to each element of that list.)