Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tied hash magic ignored in references the first time 'round #15432

Open
p5pRT opened this issue Jul 9, 2016 · 14 comments
Open

Tied hash magic ignored in references the first time 'round #15432

p5pRT opened this issue Jul 9, 2016 · 14 comments

Comments

@p5pRT
Copy link

p5pRT commented Jul 9, 2016

Migrated from rt.perl.org#128588 (status was 'open')

Searchable as RT128588$

@p5pRT
Copy link
Author

p5pRT commented Jul 9, 2016

From @sanko

A few years ago, I reported a tie related bug where the magic was being ignored on scalar values until they're accessed. Now, I'm back with a similar bug​: tie magic is ignored by ref() (and likely others but ref() makes it obvious) when handed references to parts of a tied structure unless that reference is accessed (not modified) by other means first. Consider the following...

use strict;
use warnings;
use Tie​::Hash;
$|++;

sub _debug {
  my ($ref) = @​_;
  warn ref $ref;
  warn ${$ref}->{title}; # Accessing part of it == all better
  warn ref $ref;
}

# Test with tied hash ref
tie my %tied, 'Tie​::StdHash';
$tied{record} = {id => 1,
  title => 'Record'
};
_debug(\$tied{record});

# Verify with simple hashref
my %simple;
$simple{record} = {id => 1,
  title => 'Record'
};
_debug(\$simple{record});

...the output from this reads this way​:

SCALAR at debug.pl line 11.
Record at debug.pl line 12.
REF at debug.pl line 13.
REF at debug.pl line 11.
Record at debug.pl line 12.
REF at debug.pl line 13.

With the tied hash, ref() incorrectly returns SCALAR until we access any of the reference's contents. After that, we get the correct REF reftype. Just dumping the structure or printing any of the values is enough to wake perl's internals up to the tied magic.

All code written/tested on perl v5.20.1 built for MSWin32-x64-multi-thread.

Previous (only minimally related) ticket​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=120102
Downstream bug that got me here​: sanko/readonly#25

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From @cpansprout

On Sat Jul 09 14​:46​:08 2016, sanko wrote​:

A few years ago, I reported a tie related bug where the magic was
being ignored on scalar values until they're accessed. Now, I'm back
with a similar bug​: tie magic is ignored by ref() (and likely others
but ref() makes it obvious) when handed references to parts of a tied
structure unless that reference is accessed (not modified) by other
means first.

I have known about that behaviour for years, but I was not sure that it was a bug, and I was hesitant to change it because most perl operators will call FETCH on their operands, but not on their operands’ referents. (This is related to #77502.)

In other words​:

  ref $var

calls FETCH on $var if it is tied, but

  ref \$var

does not call FETCH, because $var is not the operand to ref().

Most of the time, perl tries to call FETCH exactly once for each mentioned variable. To fix the bug would require a FETCH on $bar on line 2 of this example, even though that line does not mention $bar​:

  $foo = \$bar;
  ref $foo;

That said, it can be a real problem (e.g., for your module), so it may be worth fixing.

What do others think?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From @cpansprout

On Sat Jul 09 17​:39​:29 2016, sprout wrote​:

That said, it can be a real problem (e.g., for your module), so it may
be worth fixing.

What do others think?

I have a good counterexample. If you have dumping code that tries to dump data structures without any side effects (including tied vars), it will no longer be able to use ref $ref_to_var_we_are_dumping without potentially calling code.

So we have two use cases competing against each other. (Perl 5’s data model is wonderful but nuts.)

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From [email protected]

Father Chrysostomos via RT wrote​:

What do others think?

I think ref() shouldn't have been defined to distinguish between types of
scalar referent. But given that that is a visible part of what it does,
it seems necessary for it to handle magic on the referent.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From [email protected]

Father Chrysostomos via RT wrote​:

I have a good counterexample. If you have dumping code that tries to
dump data structures without any side effects (including tied vars),
it will no longer be able to use ref $ref_to_var_we_are_dumping without
potentially calling code.

That code is going to have difficulty anyway in dumping data without
invoking ties. ref() is far from a unique point of difficulty; how
do you imagine it acquiring a string value to dump? The only way to
walk the data structure without invoking magic is by custom XS code,
and obviously that's not going to be affected by ref().

-zefram

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From @cpansprout

On Sat Jul 09 17​:50​:30 2016, zefram@​fysh.org wrote​:

Father Chrysostomos via RT wrote​:

I have a good counterexample. If you have dumping code that tries to
dump data structures without any side effects (including tied vars),
it will no longer be able to use ref $ref_to_var_we_are_dumping without
potentially calling code.

That code is going to have difficulty anyway in dumping data without
invoking ties. ref() is far from a unique point of difficulty; how
do you imagine it acquiring a string value to dump?

You dump the object it is tied to instead. But I suppose it is easy enough to check tiedness before checking ref.

The only way to
walk the data structure without invoking magic is by custom XS code,
and obviously that's not going to be affected by ref().

-zefram

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From @cpansprout

On Sat Jul 09 17​:46​:33 2016, zefram@​fysh.org wrote​:

Father Chrysostomos via RT wrote​:

What do others think?

I think ref() shouldn't have been defined to distinguish between types of
scalar referent. But given that that is a visible part of what it does,
it seems necessary for it to handle magic on the referent.

Yes, that ref() does three different things was an unfortunate design flaw.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From @cpansprout

On Sat Jul 09 17​:46​:33 2016, zefram@​fysh.org wrote​:

But given that that is a visible part of what it does,
it seems necessary for it to handle magic on the referent.

The attached patch fixes it, but causes two test failures.

This code in Test​::Builder fails with the attached patch​:

sub is_fh {
  my $self = shift;
  my $maybe_fh = shift;
  return 0 unless defined $maybe_fh;

  return 1 if ref $maybe_fh eq 'GLOB'; # its a glob ref
  return 1 if ref \$maybe_fh eq 'GLOB'; # its a glob

  return eval { $maybe_fh->isa("IO​::Handle") } ||
  eval { tied($maybe_fh)->can('TIEHANDLE') };
}

The test in question is cpan/Test-Simple/t/Legacy/Builder/is_fh.t, which does​:

ok( Test​::Builder->is_fh(\*OUT) );

after tying *OUT to a handle package with no FETCH method.

I think there is some other bug here, because FETCH should not be called on a typeglob that was tied as such, because only its IO slot should have the magic.

The other failing test is ext/Hash-Util-FieldHash/t/11_hashassign.t, which gives me​:

Rogue call of 'HUF_watch_key_safe' at lib/Test/Builder.pm line 689.

Line 689 of Builder.pm is​:

  return unless ref $$thing;

which seems pretty innocuous.

The error is coming from ext/Hash-Util-FieldHash.xs. I have not looked into why.

As I feared, fixing this is not so simple, as code seems to be relying on the existing behaviour.

Also, the patch as it is does not account for FETCH undefining the reference. I think it will crash in that case.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From @cpansprout

Inline Patch
diff --git a/pp.c b/pp.c
index 3df0bea..47c354c 100644
--- a/pp.c
+++ b/pp.c
@@ -596,6 +596,7 @@ PP(pp_ref)
     else {
 	dTARGET;
 	SETs(TARG);
+	SvGETMAGIC(SvRV(sv));
 	/* use the return value that is in a register, its the same as TARG */
 	TARG = sv_ref(TARG,SvRV(sv),TRUE);
 	SvSETMAGIC(TARG);
diff --git a/t/op/ref.t b/t/op/ref.t
index 84d9217..80a445d 100644
--- a/t/op/ref.t
+++ b/t/op/ref.t
@@ -8,7 +8,7 @@ BEGIN {
 
 use strict qw(refs subs);
 
-plan(236);
+plan(237);
 
 # Test this first before we extend the stack with other operations.
 # This caused an asan failure due to a bad write past the end of the stack.
@@ -228,6 +228,16 @@ like (*STDOUT{IO}, qr/^IO::File=IO\(0x[0-9a-f]+\)$/,
   ok exists { ____ => undef }->{$dummy}, 'ref sets UTF8 flag correctly';
 }
 
+# [perl #128588]
+{
+    package t;
+    sub TIESCALAR {bless[]}
+    sub FETCH { *didnt_expect_a_glob_did_you }
+    package main;
+    tie my $t, t::;
+    is ref \$t, 'GLOB', 'ref() calls get-magic on the referent';
+}
+
 # Test anonymous hash syntax.
 
 $anonhash = {};

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2016

From [email protected]

Father Chrysostomos via RT wrote​:

Also, the patch as it is does not account for FETCH undefining the
reference. I think it will crash in that case.

Yes. I think it should do the dereference (SvRV(sv)) exactly once, and
hold a reference during SvGETMAGIC(). Bummer that that requires diddling
refcounts, in an operation that didn't require it before. However, it
can be avoided in most cases. It would suffice to create a mortal ref
only in the cases where the flags say there is magic. Possibly also
only in the cases where the referent is an unblessed scalar, though
using those conditions would pretty much require the magic handling to
be moved down into sv_reftype().

I wonder about other callers to sv_ref() and especially sv_reftype().
Many presumably have already handled magic, but maybe some ought to have
it added.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Jul 13, 2016

From @xsawyerx

On 07/10/2016 02​:39 AM, Father Chrysostomos via RT wrote​:

[...]
What do others think?

This will change to support one depth of reference. What about two? What
about three? I probably misunderstood something but that seems an
arbitrary depth to support.

@p5pRT
Copy link
Author

p5pRT commented Jul 13, 2016

From @xsawyerx

On 07/13/2016 01​:13 PM, Sawyer X wrote​:

On 07/10/2016 02​:39 AM, Father Chrysostomos via RT wrote​:

[...]
What do others think?
This will change to support one depth of reference. What about two? What
about three? I probably misunderstood something but that seems an
arbitrary depth to support.

D'oh. I was about to send a comment about the patch as well and read the
code in pp.c to get context and then realized the answer to my question​:
It's already there. (Zefram made this comment as well.)

Ignore my email. Sorry. :)

@p5pRT
Copy link
Author

p5pRT commented Jul 13, 2016

From @cpansprout

On Sat Jul 09 18​:59​:39 2016, sprout wrote​:

The other failing test is ext/Hash-Util-FieldHash/t/11_hashassign.t,
which gives me​:

Rogue call of 'HUF_watch_key_safe' at lib/Test/Builder.pm line 689.

Line 689 of Builder.pm is​:

return unless ref $$thing;

which seems pretty innocuous.

The error is coming from ext/Hash-Util-FieldHash.xs. I have not
looked into why.

This particular failure is due to calling SvGETMAGIC on a hash. It should be verified that the referent is a scalar (< SVt_PVAV) before doing that.

I have yet to look into the glob failure, which still baffles me.

--

Father Chrysostomos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants