[3688] in BarnOwl Developers


Re: [barnowl] Perl logging (#54)

daemon@ATHENA.MIT.EDU (Anders Kaseorg)
Sat Jan 4 05:58:16 2014

Date: Sat, 04 Jan 2014 02:58:12 -0800
From: Anders Kaseorg <notifications@github.com>
Reply-To: barnowl/barnowl <reply+p-8651443-f5a8d319eb2549d4459db8618311ee3c77a23d36-4475081@reply.github.com>
To: barnowl/barnowl <barnowl@noreply.github.com>
In-Reply-To: <barnowl/barnowl/pull/54@github.com>


In perl/lib/BarnOwl/Logging.pm:

> +}
> +
> +=head2 sanitize_filename BASE_PATH FILENAME
> +
> +Sanitizes C<FILENAME> and concatenates it with C<BASE_PATH>.
> +
> +In any filename, the characters C<"/">, C<"~">, and anything before
> +C<"!"> get replaced by underscores.  If the resulting filename is
> +empty or equal to C<"."> or C<"..">, it is replaced with C<"weird">.
> +
> +=cut
> +
> +sub sanitize_filename {
> +    my $base_path = BarnOwl::Internal::makepath(shift);
> +    my $filename = shift;
> +    $filename =~ s/[\/~\0- ]/_/g;

We are humans. Being humans, we don’t care how the numbers representing characters compare to each other. We care about the properties of characters. We should just strip out `[[:cntrl:]]` and leave it at that, like we do in other places.

(I don’t see a good reason to strip out spaces, or to start trying to decide which punctuation characters are okay. And a blacklist seems better than a whitelist because Unicode gets extended.)
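
A minimal sketch of what that suggestion might look like, assuming `/` and `~` still need replacing for path safety (an assumption; the comment above does not say either way). `sanitize_component` is a hypothetical name, and the `BASE_PATH` handling from the patch is deliberately left out:

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# Hypothetical stand-in for the filename half of sanitize_filename.
# The patch's version also joins the result with BASE_PATH.
sub sanitize_component {
    my ($filename) = @_;

    # Replace path separators, tildes, and control characters only.
    # Spaces and other punctuation are left alone.
    $filename =~ s{[/~[:cntrl:]]}{_}g;

    # Same fallback the patch's documentation describes.
    $filename = 'weird'
        if $filename eq '' || $filename eq '.' || $filename eq '..';

    return $filename;
}
```

With this sketch, `sanitize_component("foo bar")` keeps its space, while `sanitize_component("a/b~c")` becomes `a_b_c`.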

For the curious, the Venn diagram of `cntrl`, `blank`, `space`, `print`, `graph` looks like this under [`use feature 'unicode_strings'`](http://perldoc.perl.org/perlunicode.html#The-%22Unicode-Bug%22):
```
!cntrl !blank !space !print !graph: "\x{378}\x{379}\x{37f}\x{380}\x{381}\x{382}…"
!cntrl !blank !space  print  graph: "!\"#\$%&'()*+,-./0123456789:;<=>?\@ABCDEFG…"
!cntrl !blank  space !print !graph: "\x{2028}\x{2029}"
!cntrl  blank  space  print !graph: " \x{a0}\x{1680}\x{180e}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}\x{2006}\x{2007}\x{2008}\x{2009}\x{200a}\x{202f}\x{205f}\x{3000}"
 cntrl !blank !space !print !graph: "\0\1\2\3\4\5\6\a\b\16\17\20\21\22\23\24\25\26\27\30\31\32\e\34\35\36\37\177\200\201\202\203\204\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237"
 cntrl !blank  space !print !graph: "\n\13\f\r\205"
 cntrl  blank  space !print !graph: "\t"
```
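
A rough sketch of how a row label like the ones above could be computed for a single character. This is not the script that produced the table (that script is not shown); `class_signature` and `%class_re` are hypothetical names:

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# POSIX classes in the same column order as the table above.
my @classes  = qw(cntrl blank space print graph);
my %class_re = (
    cntrl => qr/[[:cntrl:]]/,
    blank => qr/[[:blank:]]/,
    space => qr/[[:space:]]/,
    print => qr/[[:print:]]/,
    graph => qr/[[:graph:]]/,
);

# Build a label such as " cntrl  blank  space !print !graph":
# a leading space means the character is in the class, "!" means it is not.
sub class_signature {
    my ($char) = @_;
    return join ' ',
        map { $char =~ $class_re{$_} ? " $_" : "!$_" } @classes;
}
```

Grouping code points by the string that `class_signature` returns would give one row per distinct combination.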

---
Reply to this email directly or view it on GitHub:
https://github.com/barnowl/barnowl/pull/54/files#r8651443

