[28508] in Perl-Users-Digest


Perl-Users Digest, Issue: 9872 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Oct 20 14:10:20 2006

Date: Fri, 20 Oct 2006 11:10:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Fri, 20 Oct 2006     Volume: 10 Number: 9872

Today's topics:
    Re: Parsing and regex <nobull67@gmail.com>
        perlembed - how to get 'use constant' values <peace.is.our.profession@gmx.de>
    Re: Regex exactly 0, 1 or 2 matches, {0,2} not working <rvtol+news@isolution.nl>
    Re: Scripting an EXE <bik.mido@tiscalinet.it>
    Re: Scripting an EXE <bwilkins@gmail.com>
    Re: Tie, dbmopen and hashes <ced@blv-sam-01.ca.boeing.com>
    Re: Tie, dbmopen and hashes <january.weiner@gmail.com>
    Re: Tie, dbmopen and hashes <mark.clementsREMOVETHIS@wanadoo.fr>
    Re: Tie, dbmopen and hashes <nobull67@gmail.com>
    Re: Tie, dbmopen and hashes xhoster@gmail.com
    Re: Validating a file name <sicsicsic@freesurf.ch>
    Re: Validating a file name <sicsicsic@freesurf.ch>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 20 Oct 2006 10:11:50 -0700
From: "Brian McCauley" <nobull67@gmail.com>
Subject: Re: Parsing and regex
Message-Id: <1161364310.530957.99470@h48g2000cwc.googlegroups.com>



On Oct 20, 7:52 am, "Ferry Bolhar" <b...@adv.magwien.gv.at> wrote:
>
> I'd suggest to use the qr operator here. It allows you to write
> the regex as-is, but without string interpolation:

The qr// operator does do interpolation. (Unless you use the qr'' form)

my $foo = 'bar';

my $re = qr/(?:$foo)+/;

print 'xxxxbarbarxxxxx' =~ /($re)/; # prints barbar
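
To make the difference between the two forms concrete, here is a minimal
sketch (the variable names are only illustrative). With qr'' the pattern is
handed to the regex engine uninterpolated, so "$" is read as an
end-of-string anchor followed by the literal "foo".

```perl
use strict;
use warnings;

my $foo = 'bar';

my $interpolated = qr/(?:$foo)+/;   # $foo is expanded, like qr/(?:bar)+/
my $literal      = qr'(?:$foo)+';   # no interpolation: "$" stays an anchor

my $text = 'xxxxbarbarxxxxx';

# Capture what the interpolated pattern matches.
my ($hit) = $text =~ /($interpolated)/;

# The uninterpolated pattern asks for "foo" after end-of-string,
# so it should not match this text.
my $literal_matches = $text =~ $literal ? 1 : 0;

print "interpolated matched: $hit\n";
print "literal form matched: ", ($literal_matches ? 'yes' : 'no'), "\n";
```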



------------------------------

Date: Fri, 20 Oct 2006 14:57:34 +0200
From: Mirco Wahab <peace.is.our.profession@gmx.de>
Subject: perlembed - how to get 'use constant' values
Message-Id: <ehahci$kqf$1@mlucom4.urz.uni-halle.de>

Hi,

in an embedded perl, I have no problems
accessing stuff from simple SV to  more
complex HV, like:

--- Perl ----
  ...
  our $r_q = 2.5;
  ...


--- C -------

  ...
  double r_q;
  SV *sv;
  if( (sv=get_sv("main::r_q", FALSE)) != 0 )
     r_q = SvNV( sv );                        // fine
  ...

But, what I didn't manage is to access
some "Variables" defined by the
"use constant" pragma.


--- Perl ----
   ...
   use constant R_Q => 2.5;
   ...


--- C -------
   ...
   if( (sv=get_sv("main::R_Q", FALSE)) != 0 )
      ...
   ...

In this case, sv will be set, but will
have:
   sv_any    - 0x00000000
   sv_flags  - 0

Any SvPV_nolen(sv) on this sv
will return the empty string: "\0".

Do you know where I could find some information
on this?

Regards & thanks

Mirco
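
A likely explanation is that "use constant" does not create a scalar at
all: it installs a subroutine named R_Q that returns the value, so there
is nothing for get_sv() to find. The sketch below shows this from the Perl
side only. From C, the same subroutine could presumably be reached with
get_cv() or call_pv() (see perlapi and perlcall), but that part is an
assumption and is not tested here.

```perl
use strict;
use warnings;
use constant R_Q => 2.5;

# The constant is reachable as a subroutine in package main ...
my $code  = main->can('R_Q');
my $value = $code->();            # calling the sub yields the value

# ... whereas the package scalar $main::R_Q was never assigned.
my $scalar_defined = do { no strict 'refs'; defined ${'main::R_Q'} };

print "R_Q() returns $value\n";
print "\$main::R_Q is ", ($scalar_defined ? 'defined' : 'undefined'), "\n";
```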


------------------------------

Date: Fri, 20 Oct 2006 15:26:18 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: Regex exactly 0, 1 or 2 matches, {0,2} not working
Message-Id: <ehapu5.1rs.1@news.isolution.nl>

GGB667 wrote:

> I want to match a number (required) 1-15 and an optional letter a-f.
> There are 0, 1 or 2 of these in the expression.  So blank is ok. 
> (Also "--" is ok - don't ask why).  These are separated by ", " or at
> least "," or " ".

-- 8< -- 8< -- 8< -- 8< -- 8< --
#!/usr/bin/perl
  use strict ;
  use warnings ;

  { local $\ = "\n";

    my $n = qr/ (?: (?: [1-9] | 1[0-5] ) [a-f]? | -- )/x ; # nr
    my $s = qr/ , [[:blank:]]? | [[:blank:]] /x ;          # sep
    my $r = qr/^ (?: ($n) (?: $s ($n) )? )? $/x ;          # regex

    while (<DATA>)
    {
      if (/$r/)
      {
         print +($1 || '') . ($2 ? ";$2" : '') ;
      }
      else
      {
         print "Invalid: $_"
      }
      print '~' x40 ;
    }
  }

__DATA__
1
2a
15f
7a, 1
8b 3
1a,2b
2b, 3c

--
1a, 2b, 3c
-- 8< -- 8< -- 8< -- 8< -- 8< --


-- 
Affijn, Ruud

"Gewoon is een tijger."


------------------------------

Date: 20 Oct 2006 16:34:04 +0200
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Scripting an EXE
Message-Id: <mgnhj2dinmj5if5ijsmuv7mnd0hkaqhtuo@4ax.com>

On 20 Oct 2006 01:45:13 -0700, "JKG" <JKGambhir@gmail.com> wrote:

>Means after providing option 2, I have to provide option q also to quit
>from the EXE.
>
>I tried the following command as per your suggestion:
>C:\>echo 2 | t2t2.exe
>but then the EXE is running in the infinite loop reading option 2
>everytime. I just want this EXE to run once for option 2 and then quit

Really?!? How strange, one would expect it to sit there *waiting* for
an option. Are you *sure* it isn't so?!?


Michele
-- 
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
 .'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,


------------------------------

Date: 20 Oct 2006 09:54:57 -0700
From: "Brian  Wilkins" <bwilkins@gmail.com>
Subject: Re: Scripting an EXE
Message-Id: <1161363297.504096.6460@k70g2000cwa.googlegroups.com>


JKG wrote:
> Hi Thanks a lot for the help.
>
> > Why can't a simple "echo 2 | t2t2.exe" (or the DOS equivalent) do the work?
> Yes, your suggestion is good and it worked up to a point.
> Actually, it was my mistake not to explain the behaviour of the EXE properly.
> Let me explain it again:
>  ------------------------
> C:\>t2t2.exe
> Reading the temp files.
> Reading done.
> Writing the temp files.
> Writing done.
> Please enter option (1..9 or q)=> 2
> You entered 2.
> File processing done.
> Please enter option (1..9 or q)=>q
> Quit......bye
> C:\>
> ------------------------
> Means after providing option 2, I have to provide option q also to quit
> from the EXE.
>
> I tried the following command as per your suggestion:
> C:\>echo 2 | t2t2.exe
> but then the EXE is running in the infinite loop reading option 2
> everytime. I just want this EXE to run once for option 2 and then quit
> (reads option q).
>
> Please help....I think you are toooooooo close.
>
>
> Josef Moellers wrote:
> > JKG wrote:
> > > I have an EXE (say t2t2.exe), I do not have its source. It have the
> > > following behaviour:
> > > ------------------------
> > > C:\>t2t2.exe
> > > Reading the temp files.
> > > Reading done.
> > > Writing the temp files.
> > > Writing done.
> > > Please enter option (1..9)=> 2
> > > You entered 2.
> > > File processing done.
> > > C:\>
> > > ------------------------
> > > There are 9 options that can be passed. But I always passes 2 and I do
> > > it several times a day.
> > > I want to write a perl script for this. I think perl pipes will help.
> > > But I dont know how to start.
> >
> > Why can't a simple "echo 2 | t2t2.exe" (or the DOS equivalent) do the work?
> > If you need to parse some feedback, there's Perl/Expect, but I don't
> > know if that runs on your system.
> >
> > Just wanting to help,
> >
> > Josef
> > --
> > Josef Möllers (Pinguinpfleger bei FSC)
> > 	If failure had no penalty success would not be a prize
> > 						-- T.  Pratchett


You need Expect. Don't fool with echo. Read here:

http://tomacorp.com/perl/expect.html
http://search.cpan.org/~rgiersig/Expect/Expect.pod

You need cygwin to run it in Windows.
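
If the EXE simply reads its options line by line from standard input (an
assumption; programs that read the console directly will not cooperate),
a plain pipe that sends "2" and then "q" may already be enough, without
Expect. The sketch below uses the core IPC::Open2 module. The helper name
run_with_input is made up for illustration, and a small Perl one-liner
stands in for t2t2.exe so that the example runs anywhere.

```perl
use strict;
use warnings;
use IPC::Open2;

# Start a command, send it the given lines on STDIN, return its output.
# Only suitable for small amounts of input and output, because everything
# is written before anything is read back.
sub run_with_input {
    my ($cmd, @lines) = @_;
    my $pid = open2(my $from_child, my $to_child, @$cmd);
    print {$to_child} "$_\n" for @lines;
    close $to_child;                 # child sees EOF after the last line
    my @output = <$from_child>;
    close $from_child;
    waitpid $pid, 0;                 # reap the child process
    return @output;
}

# Stand-in for the real program: echoes options until it reads "q".
my @stand_in = ($^X, '-ne',
    'chomp; last if $_ eq "q"; print "You entered $_.\n"');

my @reply = run_with_input(\@stand_in, 2, 'q');
print @reply;
```

For the real program, the command list would be something like
['t2t2.exe'] instead of the stand-in.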



------------------------------

Date: Fri, 20 Oct 2006 13:32:55 GMT
From: Charles DeRykus <ced@blv-sam-01.ca.boeing.com>
Subject: Re: Tie, dbmopen and hashes
Message-Id: <J7FsAw.KqD@news.boeing.com>

January Weiner wrote:
> Hi all,
> 
> I have a large flatfile (we are talking gigabytes here) with records which
> I need to access randomly.  I have tried different approaches (e.g. using
> SQL), and what seems to work fastest is a simple dbmopen() with a hash that
> holds the respective keys (e.g. record IDs) and, as values, the positions
> in the file.  Here is a sample code used for indexing:
> 
> # ----------------------------------------------------
> #!/usr/bin/perl
> # db formatter
> 
> my $file = 'flatfile.txt' ;
> 
> my %pos_hash ;
> dbmopen( %pos_hash, $file . '.idx', 0666 ) ;
> 
> my $if ;
> my $tell_pos = 0 ;
> 
> open( $if, '<flatfile.txt') || die "Cannot open $file: $!\n" ;
> 
> while(<$if>) {
>   if( substr($_, 0, 3 ) eq 'REC' ) {
>     /REC\s*(\S+)/ ;              # records start with 'REC'
>     $pos_hash{$1} = $tell_pos ;
>   }
> 
>   $tell_pos = tell( $if ) ;
> }
> 
> close( $if ) ;
> dbmclose( %pos_hash ) ;
> 
> # ----------------------------------------------------
> 
> And here is a sample code used for accessing the records:
> 
> # ----------------------------------------------------
> #!/usr/bin/perl
> # db reader
> 
> my $file = 'flatfile.txt' ;
> my $record = 'id89781827376' ;
> 
> my %pos_hash ;
> dbmopen( %pos_hash, $file . '.idx', 0444 ) ;
> 
> my $if ;
> my $tell_pos = 0 ;
> 
> open( $if, '<flatfile.txt') || die "Cannot open $file: $!\n" ;
> 
> $tell_pos = $pos_hash{$record} ; # position of the record
> seek( $if, $tell_pos, 0 ) ;      # go to the specific position
> 
> my $line = <$if> ; # read  the first record field
> print $line      ; # print the first record field
> 
> while( <$if> ) {
>   last if( substr( $_, 0, 3 ) eq 'REC' ) ; # next record
>   print $_ ;
> }
> 
> close( $if ) ;
> dbmclose( %pos_hash ) ;
> 
> # ----------------------------------------------------
> 
> 
> I have two questions:
> 
> 1) I find that two programs cannot open the database at the same time. Is
>    this correct, even though I set the permissions to "read only" (0444)?  

You need to check errors and report what occurred, eg,
dbmopen(...) or die "dbmopen failed: $!"

(Best practice: start with  'use strict; use warnings' as well)


> 
> 2) At first, I wanted to use "tie". However, the documentation I read is
>    mostly about creating new classes, and I did not understand much from
>    it.  Is there a way of using "tie" which would behave as a simple
>    dbmopen, and if yes, (i) where can I read about it and (ii) what
>    advantages will it give me?
> 
I'd recommend Tie::File or one of Perl's DBM implementations such as
SDBM_File or DB_File. The latter is faster and more full-featured
but requires the external Berkeley DB library.

( See comparison of Tie::File and DB_File implementations at 
http://perl.plover.com/TieFile/why-not-DB_File. )

However, Tie::File implements a record array. I'm not sure from your 
code if you need to fetch specific record(s) quickly. If so, a DBM
record hash would be preferable since the file is giga-size.

use strict; use warnings;
use DB_File;

tie my %hash,  'DB_File', $db_name,  ... $DB_HASH
    or die "can't open $db_name: $!";
open my $fh, '<', $file
    or die "can't open $file: $!";

 ...  # populate DBM hash with file's records

if ( exists $hash{ id89781827376 } )  {
    ...
}
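
As a rough end-to-end sketch of the idea discussed in this thread (byte
offsets kept in a tied DBM hash, then seek() to fetch a record), the
following uses SDBM_File rather than DB_File only because it ships with
Perl. The sample data, the temporary file and the helper name
fetch_record are invented for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT SEEK_SET);
use SDBM_File;
use File::Temp qw(tempdir);

# Build a tiny stand-in for the flat file.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/flatfile.txt";
open my $out, '>', $file or die "can't write $file: $!";
print {$out} "REC id1\nfield a\nfield b\nREC id2\nfield c\n";
close $out or die "can't close $file: $!";

# Index: record ID => byte offset of the record's first line.
tie(my %pos_of, 'SDBM_File', "$file.idx", O_RDWR|O_CREAT, 0666)
    or die "can't tie $file.idx: $!";

open my $fh, '<', $file or die "can't open $file: $!";
my $offset = 0;
while (my $line = <$fh>) {
    $pos_of{$1} = $offset if $line =~ /^REC\s*(\S+)/;
    $offset = tell($fh);             # start of the next line
}

# Jump to a record and return it up to (not including) the next REC line.
sub fetch_record {
    my ($handle, $index, $id) = @_;
    return unless exists $index->{$id};
    seek($handle, $index->{$id}, SEEK_SET) or die "seek failed: $!";
    my $record = <$handle>;
    while (my $line = <$handle>) {
        last if $line =~ /^REC/;
        $record .= $line;
    }
    return $record;
}

print fetch_record($fh, \%pos_of, 'id2');
```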


-- 
Charles DeRykus


------------------------------

Date: Fri, 20 Oct 2006 15:48:03 +0200 (CEST)
From: January Weiner <january.weiner@gmail.com>
Subject: Re: Tie, dbmopen and hashes
Message-Id: <ehak2j$3qj$1@sagnix.uni-muenster.de>

Mark Clements <mark.clementsREMOVETHIS@wanadoo.fr> wrote:
> With that much data I'd argue you'd be better off injecting the data 
> into an RDBMS and working on it from there. If you don't have a suitable 
> RDBMS to hand and can't justify the overhead of installing one, even a 
> lightweight solution like SQLite would probably be preferable to 
> manipulating the file manually.

I have tried SQLite. The performance is not really better than my dbmopen
solution, injecting is much slower, and the code gets much more
complicated.

j.

-- 


------------------------------

Date: Fri, 20 Oct 2006 16:31:27 +0200
From: Mark Clements <mark.clementsREMOVETHIS@wanadoo.fr>
Subject: Re: Tie, dbmopen and hashes
Message-Id: <4538ddb6$0$25951$ba4acef3@news.orange.fr>

January Weiner wrote:
> Mark Clements <mark.clementsREMOVETHIS@wanadoo.fr> wrote:
>> With that much data I'd argue you'd be better off injecting the data 
>> into an RDBMS and working on it from there. If you don't have a suitable 
>> RDBMS to hand and can't justify the overhead of installing one, even a 
>> lightweight solution like SQLite would probably be preferable to 
>> manipulating the file manually.
> 
> I have tried SQLite. The performance is not really better than my dbmopen
> solution, injecting is much slower, and the code gets much more
> complicated.

Understood. A few thoughts, however:

Try dropping the indexes and turning off sync before injecting en masse,
then recreate the indexes and turn sync back on afterwards.

Ensure there are indexes defined on query constraint columns.

The ins and outs of DBI can be wrapped by e.g. Class::DBI (there are many 
other modules of this type on CPAN), which should simplify the code 
(though this may add performance overhead).

My preference would be to use a solution of this type because I think 
the benefits outweigh the costs: an RDBMS is designed for concurrent 
access, you get transactions (with some of them), and the full power of 
SQL is available. If one isn't familiar with using an RDBMS then of 
course the costs seem that much higher.

Mark


------------------------------

Date: 20 Oct 2006 10:04:56 -0700
From: "Brian McCauley" <nobull67@gmail.com>
Subject: Re: Tie, dbmopen and hashes
Message-Id: <1161363896.501827.170890@i3g2000cwc.googlegroups.com>



On Oct 20, 11:16 am, January Weiner <january.wei...@gmail.com> wrote:

> my %pos_hash ;
> dbmopen( %pos_hash, $file . '.idx', 0444 ) ;

There's no need to declare the hash in a separate statement:

dbmopen( my %pos_hash, $file . '.idx', 0444 ) ;

> 1) I find that two programs cannot open the database at the same time. Is
>    this correct, even though I set the permissions to "read only" (0444)?

That's because you are setting the permissions, not the open mode.

> 2) At first, I wanted to use "tie". However, the documentation I read is
>    mostly about creating new classes, and I did not understand much from
>    it.  Is there a way of using "tie" which would behave as a simple
>    dbmopen,

Yes, indeed. IIRC

dbmopen( my %pos_hash, $file . '.idx', 0444 ) ;

will effectively be implemented as

use Fcntl;     # For O_RDWR, O_CREAT, etc.
use SDBM_File; # Or whatever else was specified as the default DBM
               # implementation at Perl build

tie( my %pos_hash, 'SDBM_File', $file . '.idx', O_RDWR|O_CREAT, 0444);

> (i) where can I read about it

You can read about the XXXX_File modules in their respective
documentation.

>(ii) what advantages will it give me?

Well, in your case, the most obvious advantage is the ability to
control the file open mode.
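
A minimal sketch of that advantage, assuming SDBM_File is the DBM in use:
the index is written once with O_RDWR|O_CREAT and afterwards opened with
O_RDONLY, which is the open mode rather than the permission bits. The
temporary directory and the stored offset are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT O_RDONLY);
use SDBM_File;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $idx = "$dir/flatfile.txt.idx";

# Writer: create the index and store one offset.
tie(my %writer, 'SDBM_File', $idx, O_RDWR|O_CREAT, 0666)
    or die "can't create $idx: $!";
$writer{id89781827376} = 12345;
untie %writer;                       # flush and close before reading

# Reader: the fourth argument is the open mode, here read-only.
tie(my %reader, 'SDBM_File', $idx, O_RDONLY, 0444)
    or die "can't open $idx read-only: $!";
my $pos = $reader{id89781827376};
untie %reader;

print "offset: $pos\n";
```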



------------------------------

Date: 20 Oct 2006 17:27:04 GMT
From: xhoster@gmail.com
Subject: Re: Tie, dbmopen and hashes
Message-Id: <20061020132833.599$7M@newsreader.com>

January Weiner <january.weiner@gmail.com> wrote:
>
> I have two questions:
>
> 1) I find that two programs cannot open the database at the same time. Is
>    this correct, even though I set the permissions to "read only" (0444)?

dbmopen by default automatically flocks the .pag file with LOCK_EX|LOCK_NB.

Just doing a "use DB_File;" before the dbmopen suppresses this (but
probably also changes the type of DBM module used to perform the dbmopen.)


>
> 2) At first, I wanted to use "tie". However, the documentation I read is
>    mostly about creating new classes, and I did not understand much from
>    it.

What docs did you read?

>    Is there a way of using "tie" which would behave as a simple
>    dbmopen, and if yes, (i) where can I read about it and

perldoc DB_File
perldoc NDBM_File
perldoc SDBM_File
perldoc AnyDBM_File

I tried to see what dbmopen caused to be loaded, by doing this:

$ perl -le 'my %foo=%INC; my %pos; dbmopen(%pos, "foo.idx", 0666) or warn $!;
  foreach (keys %INC) {print unless exists $foo{$_}}'

No locks available at -e line 1.
warnings/register.pm
XSLoader.pm
Carp.pm
NDBM_File.pm
Exporter.pm
strict.pm
Tie/Hash.pm
warnings.pm
AnyDBM_File.pm

So by default, dbmopen seems to use NDBM_File behind the scenes.



> (ii) what
>    advantages will it give me?

I would expect you could get more control over file locking, for one thing.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: Fri, 20 Oct 2006 15:49:24 +0200
From: Philipp <sicsicsic@freesurf.ch>
Subject: Re: Validating a file name
Message-Id: <1161352164_43@sicinfo3.epfl.ch>

Yohan N Leder wrote:
> In article <1161347521_31@sicinfo3.epfl.ch>, sicsicsic@freesurf.ch 
> says...
> 
>>I have not been able to find anything like that browsing the perl faq.
>>
> 
> 
> http://search.cpan.org/~bch/Win32-Filenames-0.01/lib/Win32/Filenames.pm

Thanks! Exactly what I needed.
Phil


------------------------------

Date: Fri, 20 Oct 2006 15:51:49 +0200
From: Philipp <sicsicsic@freesurf.ch>
Subject: Re: Validating a file name
Message-Id: <1161352309_45@sicinfo3.epfl.ch>

David Lee Lambert wrote:
> On Fri, 20 Oct 2006 14:32:01 +0200, Philipp wrote:
> 
> 
>>I want to create a file with a filename coming from a string (which I 
>>have no control over).
>>Is there a package to check if the string is valid as a filename on the 
>>OS (my case Win32)?
>>
>>The package would check for:
>>  - invalid characters (: / \ ? etc)
>>  - control characters
>>  - length of string (max 215 chars on win32 I think)
>>etc.
>>
>>I have not been able to find anything like that browsing the perl faq.
> 
> 
> Well,  the MS Tablet PC API accepts the following pattern for file-names:
> 
> * For file name, allows all IS_ONECHAR characters except: \ / : < > |
> 
> This is easy to test with regular expressions,  e.g.
> 
>   $fn_ok = ($fn =~ m!^[^\\/:<>|]+$! and $fn !~ m![\x00-\x1f\x7f-\x9f]!)
> 
> However,  in input validation,  it's generally better to test for
> compliance with the rules rather than nondangerousness.  Something like
> one of these would probably be better,  and more readable:
> 
>   $fn_ok = ($fn =~ m!^[a-z0-9 _+-]+$!i);  # ASCII alphanumeric, space,
>                                           # hyphen, underline, plus
>   
>   $fn_ok = ($fn =~ m!^[0-9\pL-]+$!);  # Unicode letter, ASCII number, hyphen
> 

Thank you for your answer.
I tried this kind of validation (regexp), but it failed because of my 
lack of knowledge (I didn't know exactly which characters are allowed in 
a Windows filename).

Phil
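
For reference, a hand-rolled check along the lines discussed above might
look like the sketch below. The rejected punctuation is the set Windows
reserves in file names (< > : " / \ | ? *). The sub name filename_ok and
the default length limit of 255 are assumptions for illustration, and
reserved device names such as CON or NUL are not handled.

```perl
use strict;
use warnings;

# Return 1 if $fn looks usable as a single Windows file name, else 0.
# $max_len is an assumed limit; adjust it to the target file system.
sub filename_ok {
    my ($fn, $max_len) = @_;
    $max_len = 255 unless defined $max_len;

    return 0 unless defined $fn && length $fn;   # empty names are out
    return 0 if length($fn) > $max_len;          # too long
    return 0 if $fn =~ /[\x00-\x1f\x7f]/;        # control characters
    return 0 if $fn =~ m![\\/:*?"<>|]!;          # reserved punctuation
    return 1;
}

for my $candidate ('report.txt', 'a:b', "line\nbreak", '') {
    (my $shown = $candidate) =~ s/\n/\\n/g;      # keep output on one line
    printf "%-14s %s\n", "'$shown'",
        filename_ok($candidate) ? 'ok' : 'rejected';
}
```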


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 9872
***************************************

