How to correctly taint "%this"? #109

Closed
MXWXZ opened this issue May 27, 2024 · 5 comments

@MXWXZ
MXWXZ commented May 27, 2024

Overall Description

Hi, I want to write a plugin that taints an object whenever any of its fields is tainted (e.g., when a tainted variable is passed to obj.setName(String), I want obj itself to be tainted afterwards). However, the receiver variable in the caller is not affected even though the "%this" variable in the callee is tainted. How should I handle this correctly?

The minimal reproduction below shows that "%this" in the callee is tainted:

[]:<org.example.User: void setName(java.lang.String)>/%this -> [[]:MergedObj{<Merged string constants>}, []:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]

However, the receiver variable $r0 in the caller is untouched:

[]:<org.example.Main: void process(java.lang.String)>/$r0 -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}]

I may be misunderstanding some part of the pointer analysis. Do I need to propagate this manually, and if so, why is it not handled automatically?

Expected Behavior

The caller object is also tainted.

Current Behavior

The caller object is not tainted.

Tai-e Arguments

Tai-e Options:
optionsFile: null
printHelp: false
classPath: []
appClassPath:
- tester-1.0-SNAPSHOT.jar
mainClass: org.example.Main
inputClasses: []
javaVersion: 8
prependJVM: false
allowPhantom: true
worldBuilderClass: pascal.taie.frontend.soot.SootWorldBuilder
outputDir: output
preBuildIR: false
worldCacheMode: false
scope: ALL
nativeModel: true
planFile: null
analyses:
  ir-dumper: ;
  pta: "taint-config:config.yml;plugins:[pascal.taie.analysis.pta.plugin.taint.SuperTaintHandler];dump:true"
onlyGenPlan: false
keepResult:
- $KEEP-ALL

Tai-e Log

IR log:
  static void process(java.lang.String r1) {
      org.example.User $r0;
      [0@L8] $r0 = new org.example.User;
      [1@L8] invokespecial $r0.<org.example.User: void <init>()>();
      [2@L9] invokevirtual $r0.<org.example.User: void setName(java.lang.String)>(r1);
      [3@L10] invokestatic <org.example.Main: void sk(org.example.User)>($r0);
      [4@L11] return;
  }

  public void setName(java.lang.String name) {
      [0@L9] %this.<org.example.User: java.lang.String name> = name;
      [1@L10] return;
  }
Points-to results:
[]:<org.example.Main: void main(java.lang.String[])>/%stringconst0 -> [[]:MergedObj{<Merged string constants>}]
[]:<org.example.Main: void main(java.lang.String[])>/r0 -> [[]:EntryPointObj{alloc=MethodParam{<org.example.Main: void main(java.lang.String[])>/0},type=java.lang.String[] in <org.example.Main: void main(java.lang.String[])>}]
[]:<org.example.Main: void process(java.lang.String)>/$r0 -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}]
[]:<org.example.Main: void process(java.lang.String)>/r1 -> [[]:MergedObj{<Merged string constants>}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]
[]:<org.example.Main: void sk(org.example.User)>/$r1 -> [[]:NewObj{<java.lang.System: java.io.PrintStream newPrintStream(java.io.FileOutputStream,java.lang.String)>[1@L1147] new java.io.PrintStream}, []:NewObj{<java.lang.System: java.io.PrintStream newPrintStream(java.io.FileOutputStream,java.lang.String)>[9@L1150] new java.io.PrintStream}]
[]:<org.example.Main: void sk(org.example.User)>/$r2 -> [[]:MergedObj{<Merged string constants>}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]
[]:<org.example.Main: void sk(org.example.User)>/r0 -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}]
[]:<org.example.User: java.lang.String getName()>/$r1 -> [[]:MergedObj{<Merged string constants>}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]
[]:<org.example.User: java.lang.String getName()>/%this -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}]
[]:<org.example.User: void <init>()>/%this -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}]
[]:<org.example.User: void setName(java.lang.String)>/%this -> [[]:MergedObj{<Merged string constants>}, []:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]
[]:<org.example.User: void setName(java.lang.String)>/name -> [[]:MergedObj{<Merged string constants>}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]

Additional Information

Key code of my plugin:
// It may not be efficient; any suggestions to improve it?
@Override
public void onNewCSMethod(CSMethod csMethod) {
    JMethod method = csMethod.getMethod();
    Context context = csMethod.getContext();
    IR ir = method.getIR();
    for (Stmt i : ir.getStmts()) {
        if (i instanceof StoreField) {
            StoreField st = (StoreField) i;
            if (st.getLValue() instanceof InstanceFieldAccess) {
                InstanceFieldAccess lv = (InstanceFieldAccess) st.getLValue();
                // Match stores of the form `%this.f = v`, where f is declared in this
                // class and v is not a constant. Strings are compared with equals(),
                // because == only compares object references.
                if ("%this".equals(lv.getBase().toString())
                        && lv.getFieldRef().getDeclaringClass() == method.getDeclaringClass()
                        && !st.getRValue().isConst()) {
                    // Let whatever v points to flow into %this as well.
                    CSVar from = solver.getCSManager().getCSVar(context, st.getRValue());
                    CSVar to = solver.getCSManager().getCSVar(context, lv.getBase());
                    solver.addPFGEdge(from, to, FlowKind.LOCAL_ASSIGN);
                }
            }
        }
    }
}
Minimal reproduction code:
package org.example;

public class Main {
  static void sk(User u){
      System.out.println(u.getName());
  }
  static void process(String src){
      User s=new User();
      s.setName(src);
      sk(s);
  }
  public static void main(String[] args) {
      process("xxx");
  }
}

package org.example;

public class User {
  public String getName() {
      return name;
  }

  public void setName(String name) {
      this.name = name;
  }

  private String name;

}
@zhangt2333
Member

Hello,

Thank you for providing such a detailed description and information about the issue. This helps reduce the number of interactions needed, which is greatly appreciated by open source maintainers.

Before addressing your question, I have a side note. What prompted you to replace the placeholder for the Tai-e Log in our New Issue Template? We originally intended it to capture runtime information, such as Tai-e Commit: d610a880a2c05968c9e60400f2041f281dee809f and java.runtime.version: 17.0.6+10, among other details. This is not a complaint, just a user study. 😆


[]:<org.example.User: void setName(java.lang.String)>/%this -> [[]:MergedObj{<Merged string constants>}, []:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]

My intuition: in this points-to set, a %this variable of type org.example.User points to a TaintObj of type String, which violates the type system. You should mock another TaintObj with the same type as %this.

I want to write a plugin that can taint any class when any field of the class is tainted

You mentioned "any", so this is not directly achievable in the current Tai-e. Writing taint configurations programmatically is on our roadmap and is currently being incubated.

But if you need this urgently, here is a simple idea for a customized implementation: monitor all changes to the points-to sets of all instance fields. When a TaintObj appears in one of them, mock a TaintObj and add it to the points-to set of the variable that holds the field's base object.
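
A toy model of this idea, written against plain Java collections rather than Tai-e's API, might look like the sketch below. Everything in it is hypothetical and for illustration only: the class name FieldTaintLifter, the string encoding of abstract objects, and the onFieldPointsTo callback do not exist in Tai-e. It only shows the shape of the logic: when a taint object reaches a field, a taint object typed like the base object is added to every variable that may point to that base object.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (not Tai-e API): lifts field taint onto variables holding the base object. */
final class FieldTaintLifter {
    // Variable name -> abstract objects it may point to.
    private final Map<String, Set<String>> varPts = new HashMap<>();
    // Abstract object -> its type.
    private final Map<String, String> typeOf = new HashMap<>();

    /** Records that `var` may point to `obj`, whose type is `type`. */
    void addVarPointsTo(String var, String obj, String type) {
        typeOf.put(obj, type);
        varPts.computeIfAbsent(var, k -> new LinkedHashSet<>()).add(obj);
    }

    /** Callback: `value` was added to the points-to set of some field of `baseObj`. */
    void onFieldPointsTo(String baseObj, String value) {
        if (!value.startsWith("TaintObj")) {
            return; // only taint objects are lifted
        }
        // Mock a taint object whose type matches the base object, not the field.
        String lifted = "TaintObj{type=" + typeOf.get(baseObj) + "}";
        for (Set<String> pts : varPts.values()) {
            if (pts.contains(baseObj)) {
                pts.add(lifted);
            }
        }
    }

    Set<String> pointsTo(String var) {
        return varPts.getOrDefault(var, Set.of());
    }

    /** Mirrors the reproduction: a tainted String is stored into User.name. */
    static FieldTaintLifter demo() {
        FieldTaintLifter lifter = new FieldTaintLifter();
        lifter.addVarPointsTo("process/$r0", "NewObj{User}", "org.example.User");
        lifter.addVarPointsTo("main/r0", "EntryPointObj", "java.lang.String[]");
        lifter.onFieldPointsTo("NewObj{User}", "TaintObj{type=java.lang.String}");
        return lifter;
    }

    public static void main(String[] args) {
        FieldTaintLifter lifter = demo();
        System.out.println(lifter.pointsTo("process/$r0"));
        System.out.println(lifter.pointsTo("main/r0"));
    }
}
```

In this model the lifted object lands on every variable that may hold the base object, so the caller's receiver becomes tainted without any flow running backwards from the callee. A real plugin would still have to notify the solver about the new object so that it propagates further.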

@MXWXZ
Author

MXWXZ commented May 28, 2024

What prompted you to modify the placeholder for Tai-e Log in our New Issue Template?

I found that my tai-e.log is always empty :(. In case the information is needed:

commit: 47bdb8b2361083151a44ba76ee2f9f2dbd363b40
java: Java(TM) SE Runtime Environment (build 17.0.11+7-LTS-207)

You should mock another TaintObj with the same type as %this.

Thanks for your help. I wrote another transfer to do this:

public class ThisTransfer implements Transfer {
    HeapModel heap;

    public ThisTransfer(HeapModel heap) {
        this.heap = heap;
    }

    @Override
    public PointsToSet apply(PointerFlowEdge edge, PointsToSet input) {
        // Compare the variable name with equals(); == only compares references.
        if (edge.target() instanceof CSVar target && "%this".equals(target.getVar().toString())) {
            List<CSObj> append = new ArrayList<>();
            input.forEach(o -> {
                if (o.getObject() instanceof MockObj mo && mo.toString().startsWith("TaintObj")) {
                    // Re-type the taint object so that it matches the type of %this.
                    if (!mo.getType().equals(edge.target().getType())) {
                        append.add(new CSObj(heap.getMockObj(mo.getDescriptor(), mo.getAllocation(),
                                edge.target().getType(), mo.getContainerMethod().orElse(null), mo.isFunctional()), o.getContext(), o.getIndex()));
                    }
                }
            });
            append.forEach(input::addObject);
        }
        return input;
    }
}

and register it in the plugin:

solver.addPFGEdge(new PointerFlowEdge(FlowKind.LOCAL_ASSIGN, from, to), new ThisTransfer(solver.getHeapModel()));

[]:<org.example.User: void setName(java.lang.String)>/%this -> [[]:MergedObj{<Merged string constants>}, []:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=org.example.User}]
[]:<org.example.User: void setName(java.lang.String)>/name -> [[]:MergedObj{<Merged string constants>}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]

From the log I can confirm that the callee %this is tainted with the correct type; however, the points-to set at the call site is

[]:<org.example.Main: void process(java.lang.String)>/$r0 -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}]

There is still no taint object here. How can I notify or add the taint object to this variable? I think this should be handled by Tai-e automatically, so something must be wrong.

@zhangt2333
Member

zhangt2333 commented May 28, 2024

I found my tai-e.log is always empty :(.

Is the whole thing empty? If so, there may be some potential errors.

commit: 47bdb8b

So this is not the latest code; it does not print runtime information (that was introduced in e87bce9). That makes sense.


  static void process(java.lang.String r1) {
      org.example.User $r0;
      [0@L8] $r0 = new org.example.User;
      [1@L8] invokespecial $r0.<org.example.User: void <init>()>();
      [2@L9] invokevirtual $r0.<org.example.User: void setName(java.lang.String)>(r1);
      [3@L10] invokestatic <org.example.Main: void sk(org.example.User)>($r0);
      [4@L11] return;
  }

  public void setName(java.lang.String name) {
      [0@L9] %this.<org.example.User: java.lang.String name> = name;
      [1@L10] return;
  }

[]:<org.example.User: void setName(java.lang.String)>/%this -> [[]:MergedObj{<Merged string constants>}, []:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=org.example.User}]

From the log I can confirm that the callee %this is tainted with the correct type; however, the points-to set at the call site is

[]:<org.example.Main: void process(java.lang.String)>/$r0 -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}]

There is still no taint object here. How can I notify or add the taint object to this variable? I think this should be handled by Tai-e automatically, so something must be wrong.

What you do is User.setName/%this <- TaintObj. Everything is correct, and Tai-e has done what it should do.

<org.example.Main: void process(java.lang.String)>/$r0 -> [NewObj]
<org.example.User: void setName(java.lang.String)>/%this -> [NewObj, TaintObj]

$r0 and %this are two variables in the IR of different methods. $r0 propagates to %this because [2@L9] invokevirtual $r0.<org.example.User: void setName(java.lang.String)>(r1); creates a PFG edge from Main.process/$r0 to User.setName/%this. However, %this does not propagate back to $r0.
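
To make the direction of the flow concrete, here is a minimal sketch of a pointer flow graph built from plain Java collections. It is a toy, not Tai-e's solver: the class ToyPfg and its methods are hypothetical names, and objects and pointers are represented as strings. Objects added to a pointer are pushed to its successors only, so an object added to the callee's %this never reaches the caller's $r0.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy pointer flow graph (not Tai-e API): objects flow only along directed edges. */
final class ToyPfg {
    private final Map<String, Set<String>> pts = new HashMap<>();
    private final Map<String, List<String>> succs = new HashMap<>();

    /** Adds `obj` to `pointer` and pushes it forward to all reachable successors. */
    void addPointsTo(String pointer, String obj) {
        Deque<String> worklist = new ArrayDeque<>();
        if (pts.computeIfAbsent(pointer, k -> new LinkedHashSet<>()).add(obj)) {
            worklist.add(pointer);
        }
        while (!worklist.isEmpty()) {
            String p = worklist.poll();
            for (String succ : succs.getOrDefault(p, List.of())) {
                if (pts.computeIfAbsent(succ, k -> new LinkedHashSet<>()).add(obj)) {
                    worklist.add(succ);
                }
            }
        }
    }

    /** Adds a directed edge and propagates what `from` already points to. */
    void addEdge(String from, String to) {
        succs.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        for (String obj : List.copyOf(pointsTo(from))) {
            addPointsTo(to, obj);
        }
    }

    Set<String> pointsTo(String pointer) {
        return pts.getOrDefault(pointer, Set.of());
    }

    /** Mirrors the call `$r0.setName(r1)`: the receiver flows to %this, not back. */
    static ToyPfg demo() {
        ToyPfg g = new ToyPfg();
        g.addPointsTo("process/$r0", "NewObj{User}");
        g.addEdge("process/$r0", "setName/%this");
        // A plugin injects a taint object into the callee's %this only.
        g.addPointsTo("setName/%this", "TaintObj{User}");
        return g;
    }

    public static void main(String[] args) {
        ToyPfg g = demo();
        System.out.println("%this -> " + g.pointsTo("setName/%this"));
        System.out.println("$r0   -> " + g.pointsTo("process/$r0"));
    }
}
```

In this toy, getting the taint object into $r0 requires either adding it to $r0 explicitly or adding a reverse edge; the latter would also send every other object in %this back to every caller, which is probably not what a plugin wants.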


A simpler approach might be closer to the one I suggested. I'm not sure whether your implementation fully meets this requirement; it could introduce additional issues.

@MXWXZ
Copy link
Author

MXWXZ commented May 28, 2024

Solved, similar to what you suggested. When ThisTransfer is applied, I manually add the taint object to the receiver variables at all invoke sites, not only to %this.
Others can refer to the basic code below. (Classes like CSObj need to be made public manually, or accessed via reflection if possible. They could also be made public in the main branch.)
I do not guarantee completeness, but it should work for most cases.

SuperTaintHandler.java
public class SuperTaintHandler implements Plugin {
    private Solver solver;

    @Override
    public void setSolver(Solver solver) {
        this.solver = solver;
    }

    @Override
    public void onNewCSMethod(CSMethod csMethod) {
        JMethod method = csMethod.getMethod();
        Context context = csMethod.getContext();
        IR ir = method.getIR();
        for (Stmt i : ir.getStmts()) {
            if (i instanceof StoreField st) {
                if (st.getLValue() instanceof InstanceFieldAccess lv) {
                    // Match `%this.f = v` with a non-constant v. Strings are compared
                    // with equals(), because == only compares object references.
                    if ("%this".equals(lv.getBase().toString()) && !st.getRValue().isConst()) {
                        CSVar from = solver.getCSManager().getCSVar(context, st.getRValue());
                        CSVar to = solver.getCSManager().getCSVar(context, lv.getBase());
                        // Collect the receiver variable of every call site known so far
                        // that invokes this method.
                        Set<Var> varList = new HashSet<>();
                        for (Edge<CSCallSite, CSMethod> e : csMethod.getEdges()) {
                            // Skip call sites that are not instance invocations
                            // instead of failing on the cast.
                            if (e.getCallSite().getCallSite().getInvokeExp() instanceof InvokeInstanceExp exp) {
                                varList.add(exp.getBase());
                            }
                        }
                        solver.addPFGEdge(new PointerFlowEdge(FlowKind.LOCAL_ASSIGN, from, to), new ThisTransfer(solver, varList));
                    }
                }
            }
        }
    }
}
ThisTransfer.java
public class ThisTransfer implements Transfer {
    Solver solver;

    // Receiver variables at the call sites of the method that owns %this.
    Set<Var> varList;

    public ThisTransfer(Solver solver, Set<Var> varList) {
        this.solver = solver;
        this.varList = varList;
    }

    @Override
    public PointsToSet apply(PointerFlowEdge edge, PointsToSet input) {
        // Compare the variable name with equals(); == only compares references.
        if (edge.target() instanceof CSVar target && "%this".equals(target.getVar().toString())) {
            List<CSObj> append = new ArrayList<>();
            input.forEach(o -> {
                if (o.getObject() instanceof MockObj mo && mo.toString().startsWith("TaintObj")) {
                    if (!mo.getType().equals(edge.target().getType())) {
                        // Re-type the taint object so that it matches the type of %this.
                        CSObj taint = new CSObj(solver.getHeapModel().getMockObj(mo.getDescriptor(), mo.getAllocation(),
                                edge.target().getType(), mo.getContainerMethod().orElse(null), mo.isFunctional()), o.getContext(), o.getIndex());
                        // Write the re-typed taint object directly into the points-to
                        // set of each receiver variable at the call sites.
                        varList.forEach(var -> {
                            CSVar csvar = solver.getCSManager().getCSVar(o.getContext(), var);
                            PointsToSet set = csvar.getPointsToSet();
                            set.addObject(taint);
                            csvar.setPointsToSet(set);
                        });
                        append.add(taint);
                    }
                }
            });
            append.forEach(input::addObject);
        }
        return input;
    }
}

For the example code above, it should generate

[]:<org.example.User: void setName(java.lang.String)>/%this -> [[]:MergedObj{<Merged string constants>}, []:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=org.example.User}]
[]:<org.example.User: void setName(java.lang.String)>/name -> [[]:MergedObj{<Merged string constants>}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=java.lang.String}]
[]:<org.example.Main: void process(java.lang.String)>/$r0 -> [[]:NewObj{<org.example.Main: void process(java.lang.String)>[0@L8] new org.example.User}, []:TaintObj{alloc=<org.example.Main: void process(java.lang.String)>/0,type=org.example.User}]

Is the whole thing empty?

Yes, I have also updated to the latest version. The log file is completely empty.

git log
commit b848c52 (HEAD -> master, origin/master, origin/HEAD)

Console output
D:\taie\build>java -jar tai-e-all-0.5.1-SNAPSHOT.jar --options-file=options.yml
Tai-e starts ...
Output directory: D:\taie\build\output
Writing options to D:\taie\build\output\options.yml
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
Writing log to D:\taie\build\output\tai-e.log
java.version: 17.0.11
java.version.date: 2024-04-16
java.runtime.version: 17.0.11+7-LTS-207
java.vendor: Oracle Corporation
java.vendor.version: null
os.name: Windows 10
os.version: 10.0
os.arch: amd64
Tai-e Version: 0.5.1-SNAPSHOT
Tai-e Commit: d610a880a2c05968c9e60400f2041f281dee809f
.....

Anyway, I appreciate your prompt help and your work on such a useful tool! Cheers.

@zhangt2333
Member

Yes, I have also updated to the latest version. The log file is completely empty.

Fixed in cfd0fb7.

@MXWXZ MXWXZ closed this as completed Jun 4, 2024