Dan Lyke wrote about the Perl regular expression for extracting XMP from Adobe files:
In the XML extractor, try replacing the grouping for $3, currently “(.*)”, with “(.*?)”. “*” and “+” are greedy by default, so the part of the pattern that follows the group will match at its last occurrence in the document rather than its first. The “?” makes them non-greedy. Almost always people mean “.*?” rather than “.*”.
So that expression becomes:
m/id='W5M0MpCehiHzreSzNTczkc9d'\s* \
(bytes=')*([^']*)'?\?> \
(.*?)<\?xpacket end='([^']*)'\?>/sg
Note that I split the expression across three lines.
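
To see what the change buys, here is a minimal sketch that runs both versions of the group against a made-up document containing two XMP packets. The sample data and the variable names ($doc, $lazy, $greedy) are my own inventions for illustration. I am also assuming the backslashes at the line splits are only continuation markers and not part of the pattern, so the expression is joined back into one line here.

```perl
use strict;
use warnings;

# Hypothetical sample: two XMP packets embedded in other data.
my $doc = join '',
    "junk",
    "<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>",
    "<x:xmpmeta>first</x:xmpmeta>",
    "<?xpacket end='w'?>",
    "more junk",
    "<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>",
    "<x:xmpmeta>second</x:xmpmeta>",
    "<?xpacket end='r'?>";

# The expression with the non-greedy group, joined into a single line
# (assumption: the line-split backslashes are not part of the pattern).
my $lazy = qr/id='W5M0MpCehiHzreSzNTczkc9d'\s*(bytes=')*([^']*)'?\?>(.*?)<\?xpacket end='([^']*)'\?>/s;

# The same expression with the earlier greedy group, for comparison.
my $greedy = qr/id='W5M0MpCehiHzreSzNTczkc9d'\s*(bytes=')*([^']*)'?\?>(.*)<\?xpacket end='([^']*)'\?>/s;

# Collect the packet body ($3) from every match of each pattern.
my @lazy_packets;
while ($doc =~ /$lazy/g) {
    push @lazy_packets, $3;
}

my @greedy_packets;
while ($doc =~ /$greedy/g) {
    push @greedy_packets, $3;
}

print scalar(@lazy_packets), " packet(s) with (.*?)\n";
print "  $_\n" for @lazy_packets;
print scalar(@greedy_packets), " packet(s) with (.*)\n";
print "  $_\n" for @greedy_packets;
```

On this sample the non-greedy version finds two packets, one per xpacket pair, while the greedy version finds a single match whose $3 runs from the first packet's body through to the last "<?xpacket end=" in the document.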