Implemented draft of RMMR algorithm. #41

wants to merge 43 commits into
base: master
Commits
174fcd2
Implemented draft of RMMR algorithm.
RamSaw Mar 28, 2018
6f61018
Made several changes: corrected wrong behaviour because of missing br…
RamSaw Apr 11, 2018
0fd31dd
Corrected test: it ran all algorithms, but assertions for RMMR algo a…
RamSaw Apr 11, 2018
bc85cad
Corrected marks from pull request.
RamSaw Apr 15, 2018
df7b1e0
Added tests for MovieRentalStore example.
RamSaw Apr 24, 2018
08536b2
Added javadocs and made code clean up.
RamSaw Apr 24, 2018
538ff85
Changed to run parallel. Parallel executing is not comfortable for de…
RamSaw Apr 24, 2018
59b19a2
Made RMMR constructor public. IDEA says it can be package-private but…
RamSaw Apr 24, 2018
1fce0bc
Made RmmrEntitySearcher more consistent and turned to qualified names.
RamSaw May 21, 2018
79ebe6b
Moved 0.01 hardcoded constant to static final constant.
RamSaw May 21, 2018
22ac2b6
Removed new expressions, added Builders, Utils, Factory cases, change…
RamSaw May 29, 2018
e227e2b
Added contextual similarity calculation. Very raw version. TODO: opti…
RamSaw Jul 6, 2018
2d0ddd0
Fixed null pointer exception by change get to getOrDefault. Also adde…
RamSaw Jul 6, 2018
e9f71fd
Changed to multisets and all maps from entity moved to entity class t…
RamSaw Jul 6, 2018
00333f5
Deleted redundant documents sustaining. Now it consists of two refere…
RamSaw Jul 9, 2018
2c4d416
Added support of configuration of algorithm.
RamSaw Jul 9, 2018
c1379f4
Added stemming.
RamSaw Jul 9, 2018
2400fb5
Reverted ArchitectureReloaded.iml file.
RamSaw Jul 9, 2018
66c2b2c
fix conflicts commit
RamSaw Jul 9, 2018
826ee53
Added jar for stemming
RamSaw Jul 9, 2018
83168b2
Removed qualified name error fix because tests were not passing.
RamSaw Jul 9, 2018
9eef351
Merge branch 'RMMR_implementation' of https://github.com/ml-in-progra…
RamSaw Jul 9, 2018
42a7471
Erased bug with qualified names and added excluding short names like …
RamSaw Jul 9, 2018
08740ec
Added tests for RMMR and added support for setting up or not field re…
RamSaw Jul 9, 2018
ebe7152
Speeded up algorithm by adding coordinates that are only not null. Th…
RamSaw Jul 10, 2018
b9e8966
Fixed tests. Failing test were uncommented by mistake.
RamSaw Jul 10, 2018
28153b7
Algorithm was optimized more, now bag finder class deleted and its lo…
RamSaw Jul 10, 2018
d2f1015
Found mistake by dividing on zero. Fixed and as a consequence move me…
RamSaw Jul 10, 2018
c0503ef
RMMR was set up to pass more test cases. On real project output is no…
RamSaw Jul 10, 2018
9f884cd
Ignored tests in RmmrDistancesTest and RmmrEntitySearcherTest. Reason…
RamSaw Jul 10, 2018
5cd9df9
Fixed tests failure. Ignored tests in RmmrDistancesTest and RmmrEntit…
RamSaw Jul 10, 2018
82ade8e
Added support for method references: ClassA::methodInClassA.
RamSaw Jul 10, 2018
bc238ca
Added support for class references. For example call of static method…
RamSaw Jul 10, 2018
a342283
Added failure explanations for all tests which fail.
RamSaw Jul 10, 2018
f15f724
Corrected weights for conceptual and contextual similarity.
RamSaw Jul 10, 2018
9bd97d4
Corrected accuracy formula - now output is a little bit better.
RamSaw Jul 11, 2018
fd94e63
Made clean up of code, code prepared for review. Deleted old test data.
RamSaw Jul 11, 2018
3d86756
Corrected accuracy formula to fit other algorithms. Before that accur…
RamSaw Jul 11, 2018
afa2096
Fixes #50
RamSaw Jul 11, 2018
9fc5780
Fixed test failure.
RamSaw Jul 11, 2018
ac4cf63
Reverted bug fix, because it introduced new bug.
RamSaw Jul 13, 2018
bd370e9
Merge remote-tracking branch 'origin/master' into RMMR_implementation
RamSaw Jul 16, 2018
c183397
Merge remote-tracking branch 'origin/master' into RMMR_implementation
RamSaw Jul 16, 2018
350 changes: 349 additions & 1 deletion ArchitectureReloaded.iml

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -24,5 +24,5 @@ dependencies {
compile project(':openapi')
compile project(':utils')
compile project(':stockmetrics')
compile files('lib/args4j-2.32.jar', 'lib/jcommon-0.9.1.jar', 'lib/jfreechart-0.9.16.jar')
compile files('lib/args4j-2.32.jar', 'lib/jcommon-0.9.1.jar', 'lib/jfreechart-0.9.16.jar', 'lib/PorterStemmer.jar')
}
4 changes: 2 additions & 2 deletions gradle/wrapper/gradle-wrapper.properties
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#Wed Jul 12 12:29:38 MSK 2017
#Mon Jul 09 12:44:35 MSK 2018
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-2.10-bin.zip
distributionUrl=https\://services.gradle.org/distributions/gradle-2.10-all.zip
14 changes: 4 additions & 10 deletions openapi/openapi.iml
Original file line number Diff line number Diff line change
@@ -1,15 +1,9 @@
<?xml version="1.0" encoding="UTF-8"?>
<module relativePaths="true" type="JAVA_MODULE" version="4">
<component name="NewModuleRootManager" inherit-compiler-output="false">
<output url="file://$MODULE_DIR$/build/classes" />
<output-test url="file://$MODULE_DIR$/build/testclasses" />
<module external.linked.project.id="openapi" external.linked.project.path="$MODULE_DIR$" external.root.project.path="$MODULE_DIR$" external.system.id="GRADLE" type="JAVA_MODULE" version="4">
<component name="NewModuleRootManager" inherit-compiler-output="true">
<exclude-output />
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
<excludeFolder url="file://$MODULE_DIR$/build" />
</content>
<content url="file://$MODULE_DIR$" />
<orderEntry type="inheritedJdk" />
<orderEntry type="sourceFolder" forTests="false" />
</component>
</module>

</module>
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ private void setFields(OldEntity entity, ElementAttributes attributes) {

vectorField.set(entity, attributes.getRawFeatures());
vectorField.setAccessible(accessible);
} catch (NoSuchFieldException | IllegalAccessException e) {
} catch (NoSuchFieldException | IllegalAccessException ignored) {
}
}
}
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,260 @@
/*
* Copyright 2018 Machine Learning Methods in Software Engineering Group of JetBrains Research
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.jetbrains.research.groups.ml_methods.algorithm;

import com.sixrr.metrics.Metric;
import org.apache.log4j.Logger;
import org.jetbrains.annotations.NotNull;
import org.jetbrains.research.groups.ml_methods.algorithm.entity.ClassOldEntity;
import org.jetbrains.research.groups.ml_methods.algorithm.entity.EntitySearchResult;
import org.jetbrains.research.groups.ml_methods.algorithm.entity.MethodOldEntity;
import org.jetbrains.research.groups.ml_methods.algorithm.refactoring.Refactoring;
import org.jetbrains.research.groups.ml_methods.config.Logging;
import org.jetbrains.research.groups.ml_methods.utils.AlgorithmsUtil;

import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

import static java.lang.Math.*;

/**
* Implementation of RMMR (Recommendation of Move Method Refactoring) algorithm.
* Based on <a href="https://drive.google.com/file/d/17yAlVXRaLuhIcXB4PEzNiZj5p1oi4HlL/view">this article</a>.
*/
// TODO: maybe consider that method and target class are in different packages?
public class RMMR extends OldAlgorithm {
/** Internal name of the algorithm in the program */
public static final String NAME = "RMMR";
private static final boolean ENABLE_PARALLEL_EXECUTION = false;
/** Minimal accuracy at which a refactoring is still reported */
private final static double MIN_ACCURACY = 0.01;
/** Accuracy at which the algorithm is fairly confident in the refactoring */
private final static double GOOD_ACCURACY_BOUND = 0.5;
/** Accuracy at or above which the result is clamped to the maximum value of 1 */
private final static double MAX_ACCURACY_BOUND = 0.7;
/** Power to which the scaled accuracy (accuracy / MAX_ACCURACY_BOUND) is raised */
// The value is chosen so that GOOD_ACCURACY_BOUND maps to MAX_ACCURACY_BOUND after stretching.
private final static double POWER_FOR_ACCURACY = log(MAX_ACCURACY_BOUND) / log(GOOD_ACCURACY_BOUND / MAX_ACCURACY_BOUND);
private static final Logger LOGGER = Logging.getLogger(RMMR.class);
private final Map<ClassOldEntity, Set<MethodOldEntity>> methodsByClass = new HashMap<>();
private final List<MethodOldEntity> units = new ArrayList<>();
/** Classes considered as move targets for a method */
private final List<ClassOldEntity> classEntities = new ArrayList<>();
private final AtomicInteger progressCount = new AtomicInteger();
/** Context which stores all found classes and methods together with their metrics (by storing OldEntity) */
private OldExecutionContext context;

public RMMR() {
super(NAME, true);
}

@Override
@NotNull
protected List<Refactoring> calculateRefactorings(@NotNull OldExecutionContext context, boolean enableFieldRefactorings) {
if (enableFieldRefactorings) {
LOGGER.error("Field refactorings are not supported",
new UnsupportedOperationException("Field refactorings are not supported"));
}
this.context = context;
init();

if (ENABLE_PARALLEL_EXECUTION) {
return context.runParallel(units, ArrayList::new, this::findRefactoring, AlgorithmsUtil::combineLists);
} else {
List<Refactoring> accum = new LinkedList<>();
units.forEach(methodEntity -> findRefactoring(methodEntity, accum));
return accum;
}
}

/** Initializes units, methodsByClass, classEntities. Data is gathered from context.getEntities() */
private void init() {
final EntitySearchResult entities = context.getEntities();
LOGGER.info("Init RMMR");
units.clear();
classEntities.clear();
methodsByClass.clear();

classEntities.addAll(entities.getClasses());
units.addAll(entities.getMethods());
progressCount.set(0);

entities.getMethods().forEach(methodEntity -> {
List<ClassOldEntity> methodClassEntity = entities.getClasses().stream()
.filter(classEntity -> methodEntity.getClassName().equals(classEntity.getName()))
.collect(Collectors.toList());
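if (methodClassEntity.isEmpty()) {
// No owning class was found, so get(0) below would throw; skip this method.
LOGGER.error("Found no class that has method " + methodEntity.getName());
return;
}
if (methodClassEntity.size() > 1) {
LOGGER.error("Found more than 1 class that has method " + methodEntity.getName());
}
methodsByClass.computeIfAbsent(methodClassEntity.get(0), anyKey -> new HashSet<>()).add(methodEntity);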
});
}

/**
* Decides whether to move the method, based on the distances between the given method and the classes.
*
* @param entity method to check for move method refactoring.
* @param accumulator list of refactorings, if method must be moved, refactoring for it will be added to this accumulator.
* @return changed or unchanged accumulator.
*/
@NotNull
private List<Refactoring> findRefactoring(@NotNull MethodOldEntity entity, @NotNull List<Refactoring> accumulator) {
context.reportProgress((double) progressCount.incrementAndGet() / units.size());
context.checkCanceled();
if (!entity.isMovable() || classEntities.size() < 2) {
return accumulator;
}
double minDistance = Double.POSITIVE_INFINITY;
double difference = Double.POSITIVE_INFINITY;
double distanceWithSourceClass = 1;
ClassOldEntity targetClass = null;
ClassOldEntity sourceClass = null;
for (final ClassOldEntity classEntity : classEntities) {
final double contextualDistance = classEntity.getRelevantProperties().getContextualVector().size() == 0 ? 1 : getContextualDistance(entity, classEntity);
final double conceptualDistance = getConceptualDistance(entity, classEntity);
final double distance = 0.55 * conceptualDistance + 0.45 * contextualDistance;
if (classEntity.getName().equals(entity.getClassName())) {
sourceClass = classEntity;
distanceWithSourceClass = distance;
}
if (distance < minDistance) {
difference = minDistance - distance;
minDistance = distance;
targetClass = classEntity;
} else if (distance - minDistance < difference) {
difference = distance - minDistance;
}
}

if (targetClass == null) {
LOGGER.warn("targetClass is null for " + entity.getName());
return accumulator;
}
final String targetClassName = targetClass.getName();
double differenceWithSourceClass = distanceWithSourceClass - minDistance;
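// getOrDefault avoids a NullPointerException when the source class was not found among classEntities.
int numberOfMethodsInSourceClass = methodsByClass.getOrDefault(sourceClass, Collections.emptySet()).size();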
int numberOfMethodsInTargetClass = methodsByClass.getOrDefault(targetClass, Collections.emptySet()).size();
// considers amount of entities.
double sourceClassCoefficient = min(1, max(1.1 - 1.0 / (2 * numberOfMethodsInSourceClass * numberOfMethodsInSourceClass), 0));
double targetClassCoefficient = min(1, max(1.1 - 1.0 / (4 * numberOfMethodsInTargetClass * numberOfMethodsInTargetClass), 0));
double powerCoefficient = min(1, max(1.1 - 1.0 / (2 * entity.getRelevantProperties().getClasses().size()), 0));
double accuracy = (0.5 * distanceWithSourceClass + 0.1 * (1 - minDistance) + 0.4 * differenceWithSourceClass) * powerCoefficient * sourceClassCoefficient * targetClassCoefficient;
if (entity.getClassName().contains("Util") || entity.getClassName().contains("Factory") ||
entity.getClassName().contains("Builder")) {
if (accuracy > GOOD_ACCURACY_BOUND) {
accuracy /= 2;
}
}
if (entity.getName().contains("main")) {
accuracy /= 2;
}
accuracy = min(pow(accuracy / MAX_ACCURACY_BOUND, POWER_FOR_ACCURACY), 1);
if (differenceWithSourceClass != 0 && accuracy >= MIN_ACCURACY && !targetClassName.equals(entity.getClassName())) {
accumulator.add(Refactoring.createRefactoring(entity.getName(), targetClassName, accuracy, entity.isField(), context.getScope()));
}
return accumulator;
}

/**
* Measures contextual distance (a number in [0; 1]) between a method and a class.
* It is 1 minus the cosine similarity of the two contextual vectors.
* If either vector is a zero vector, the distance is 1.
*
* @param methodEntity method to calculate contextual distance.
* @param classEntity class to calculate contextual distance.
* @return contextual distance between the method and the class.
*/
private double getContextualDistance(@NotNull MethodOldEntity methodEntity, @NotNull ClassOldEntity classEntity) {
Map<String, Double> methodVector = methodEntity.getRelevantProperties().getContextualVector();
Map<String, Double> classVector = classEntity.getRelevantProperties().getContextualVector();
double methodVectorNorm = norm(methodVector);
double classVectorNorm = norm(classVector);
return methodVectorNorm == 0 || classVectorNorm == 0 ?
1 : 1 - dotProduct(methodVector, classVector) / (methodVectorNorm * classVectorNorm);
}

private double dotProduct(@NotNull Map<String, Double> vector1, @NotNull Map<String, Double> vector2) {
final double[] productValue = {0};
vector1.forEach((s, aDouble) -> productValue[0] += aDouble * vector2.getOrDefault(s, 0.0));
return productValue[0];
}

private double norm(@NotNull Map<String, Double> vector) {
return sqrt(dotProduct(vector, vector));
}

/**
* Measures conceptual distance (a number in [0; 1]) between a method and a class.
* It is the average of the distances between the method and the other methods of the class.
* If the class has no other methods, the distance is 1.
* @param methodEntity method to calculate conceptual distance.
* @param classEntity class to calculate conceptual distance.
* @return conceptual distance between the method and the class.
*/
private double getConceptualDistance(@NotNull MethodOldEntity methodEntity, @NotNull ClassOldEntity classEntity) {
int number = 0;
double sumOfDistances = 0;

if (methodsByClass.containsKey(classEntity)) {
for (MethodOldEntity methodEntityInClass : methodsByClass.get(classEntity)) {
if (!methodEntity.equals(methodEntityInClass)) {
sumOfDistances += getConceptualDistance(methodEntity, methodEntityInClass);
number++;
}
}
}

return number == 0 ? 1 : sumOfDistances / number;
}

/**
* Measures conceptual distance (a number in [0; 1]) between two methods.
* It is 1 - sizeOfIntersection(A1, A2) / sizeOfUnion(A1, A2), where Ai is the conceptual set of method i.
* If both A1 and A2 are empty, the distance is 1.
* @param methodEntity1 method to calculate conceptual distance.
* @param methodEntity2 method to calculate conceptual distance.
* @return conceptual distance between two given methods.
*/
private double getConceptualDistance(@NotNull MethodOldEntity methodEntity1, @NotNull MethodOldEntity methodEntity2) {
Set<String> method1Classes = methodEntity1.getRelevantProperties().getClasses();
Set<String> method2Classes = methodEntity2.getRelevantProperties().getClasses();
int sizeOfIntersection = intersection(method1Classes, method2Classes).size();
int sizeOfUnion = union(method1Classes, method2Classes).size();
return (sizeOfUnion == 0) ? 1 : 1 - (double) sizeOfIntersection / sizeOfUnion;
}

@NotNull
private <T> Set<T> intersection(@NotNull Set<T> set1, @NotNull Set<T> set2) {
Set<T> intersection = new HashSet<>(set1);
intersection.retainAll(set2);
return intersection;
}

@NotNull
private <T> Set<T> union(@NotNull Set<T> set1, @NotNull Set<T> set2) {
Set<T> union = new HashSet<>(set1);
union.addAll(set2);
return union;
}

@NotNull
@Override
public List<Metric> requiredMetrics() {
return Collections.emptyList();
}
}
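For reviewers who want to check the arithmetic in isolation, here is a minimal standalone sketch of the two distance measures and the accuracy stretch used in RMMR.java above. The class name `RmmrDistanceSketch` and its method names are hypothetical and not part of this pull request; the constants and the 0.55 / 0.45 weights are copied from the diff, while the sample sets and vectors are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical standalone sketch; not part of the pull request. */
public class RmmrDistanceSketch {
    // Constants copied from RMMR.java in this diff.
    public static final double GOOD_ACCURACY_BOUND = 0.5;
    public static final double MAX_ACCURACY_BOUND = 0.7;
    public static final double POWER_FOR_ACCURACY =
            Math.log(MAX_ACCURACY_BOUND) / Math.log(GOOD_ACCURACY_BOUND / MAX_ACCURACY_BOUND);

    /** Jaccard distance: 1 - |A ∩ B| / |A ∪ B|; defined as 1 when both sets are empty. */
    public static double jaccardDistance(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return 1 - (double) intersection.size() / union.size();
    }

    /** Dot product of two sparse vectors stored as term -> weight maps. */
    public static double dot(Map<String, Double> u, Map<String, Double> v) {
        double sum = 0;
        for (Map.Entry<String, Double> e : u.entrySet()) {
            sum += e.getValue() * v.getOrDefault(e.getKey(), 0.0);
        }
        return sum;
    }

    /** Cosine distance: 1 - cos(u, v); defined as 1 when either vector has zero norm. */
    public static double cosineDistance(Map<String, Double> u, Map<String, Double> v) {
        double normU = Math.sqrt(dot(u, u));
        double normV = Math.sqrt(dot(v, v));
        return normU == 0 || normV == 0 ? 1 : 1 - dot(u, v) / (normU * normV);
    }

    /** Stretches a raw accuracy so that GOOD_ACCURACY_BOUND maps to MAX_ACCURACY_BOUND, capped at 1. */
    public static double stretch(double accuracy) {
        return Math.min(Math.pow(accuracy / MAX_ACCURACY_BOUND, POWER_FOR_ACCURACY), 1);
    }

    public static void main(String[] args) {
        // Made-up conceptual sets (names of classes each method refers to).
        Set<String> a = new HashSet<>(Arrays.asList("Customer", "Rental"));
        Set<String> b = new HashSet<>(Arrays.asList("Rental", "Movie"));
        // Made-up contextual vectors (term weights).
        Map<String, Double> u = new HashMap<>();
        u.put("rental", 1.0);
        u.put("price", 1.0);
        Map<String, Double> v = new HashMap<>();
        v.put("rental", 2.0);

        double conceptual = jaccardDistance(a, b);
        double contextual = cosineDistance(u, v);
        // Same weighting as in findRefactoring.
        System.out.println("distance = " + (0.55 * conceptual + 0.45 * contextual));
        System.out.println("stretch(GOOD_ACCURACY_BOUND) = " + stretch(GOOD_ACCURACY_BOUND));
    }
}
```

The stretch works because (GOOD / MAX) raised to log(MAX) / log(GOOD / MAX) equals exp(log(MAX)) = MAX, so a raw accuracy of 0.5 lands at about 0.7, and anything at or above 0.7 is capped at 1.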
Original file line number Diff line number Diff line change
Expand Up @@ -26,18 +26,15 @@
public abstract class OldEntity {
private static final VectorCalculator CLASS_ENTITY_CALCULATOR = new VectorCalculator()
.addMetricDependence(NumMethodsClassMetric.class)
.addMetricDependence(NumAttributesAddedMetric.class)
;
.addMetricDependence(NumAttributesAddedMetric.class);

private static final VectorCalculator METHOD_ENTITY_CALCULATOR = new VectorCalculator()
.addConstValue(0)
.addConstValue(0)
;
.addConstValue(0);

private static final VectorCalculator FIELD_ENTITY_CALCULATOR = new VectorCalculator()
.addConstValue(0)
.addConstValue(0)
;
.addConstValue(0);

private static final int DIMENSION = CLASS_ENTITY_CALCULATOR.getDimension();

Expand Down Expand Up @@ -68,6 +65,7 @@ public PsiElement getPsiElement() {
protected OldEntity(OldEntity original) {
relevantProperties = original.relevantProperties.copy();
name = original.name;
element = original.element;
vector = Arrays.copyOf(original.vector, original.vector.length);
isMovable = original.isMovable;
}
Expand Down Expand Up @@ -164,4 +162,4 @@ public boolean isMovable() {
abstract public OldEntity copy();

abstract public boolean isField();
}
}