CODE SEQUENCE BASED INTELLIGENT KEY CODE IDENTIFICATION METHOD AND RECORDING MEDIUM AND DEVICE FOR PERFORMING THE SAME

Information

  • Patent Application
  • Publication Number
    20220207296
  • Date Filed
    January 28, 2021
  • Date Published
    June 30, 2022
Abstract
A code sequence based intelligent key code identification method includes extracting Smali code sequence by decompiling an application, vectorizing the extracted Smali code sequence to construct a training dataset, training a deep learning model with the vectorized Smali code sequence to generate a classifier, generating a category classification result using Smali code sequence of a target application as input of the classifier, and identifying and providing important Smali code sequence from which the classification result of the target application is derived. Accordingly, it is possible to objectively evaluate the application using Smali code sequence of the application being actually run.
Description
TECHNICAL FIELD

The present disclosure relates to a code sequence based intelligent key code identification method and a recording medium and a device for performing the same, and more particularly, to technology that objectively evaluates an application using Smali code sequence based on source code of the application being actually run.


BACKGROUND ART

Most application evaluation techniques rely on permissions, descriptions, and user reviews. Permissions and descriptions are requested and written from the developer's subjective point of view, and thus are less objective, which makes accurate evaluation difficult to expect.


Additionally, in the case of permissions, a developer who does not accurately understand the meaning and influence of a permission may request more permissions than necessary, and in many cases this greatly affects the evaluation irrespective of the actual execution of the application.


User reviews are likewise written from the user's subjective point of view, and thus are less objective, and the existing techniques do not accurately reflect the actual execution of the application.


Recently, evaluation methods based on the application programming interface (API) responsible for the actual execution of applications have emerged, but the deep learning and machine learning models they use are relatively simple and cannot identify the actual usage relationships of the API, and thus fail to use the features of the API effectively in the evaluation.


Additionally, when applications are classified, they can only be classified simply as benign applications or malicious applications.


RELATED LITERATURES
Patent Literatures

(Patent Literature 1) KR 10-2020-0096766 A


(Patent Literature 2) KR 10-2144044 B1


(Patent Literature 3) KR 10-1477050 B1


DISCLOSURE
Technical Problem

In view of this circumstance, the present disclosure is directed to providing a code sequence based intelligent key code identification method.


The present disclosure is further directed to providing a recording medium having recorded thereon a computer program for performing the code sequence based intelligent key code identification method.


The present disclosure is further directed to providing a device for performing the code sequence based intelligent key code identification method.


Technical Solution

A code sequence based intelligent key code identification method according to an embodiment for achieving the above-described object of the present disclosure includes extracting Smali code sequence by decompiling an application, vectorizing the extracted Smali code sequence to construct a training dataset, training a deep learning model with the vectorized Smali code sequence to generate a classifier, generating a category classification result using Smali code sequence of a target application as input of the classifier, and identifying and providing important Smali code sequence from which the classification result of the target application is derived.


In an embodiment of the present disclosure, constructing the training dataset may include constructing the training dataset using all the extracted Smali code sequences, and vectorizing the training dataset to use as input of the deep learning model.


In an embodiment of the present disclosure, extracting the Smali code sequence may include extracting Smali code by decompiling the application for each category, and converting the Smali code to Smali code sequence.


In an embodiment of the present disclosure, generating the category classification result may include classifying as a category having a highest probability among categories that will be classified for the target application.


In an embodiment of the present disclosure, identifying and providing the important Smali code sequence may use Local Interpretable Model-Agnostic Explanation (LIME) which is an algorithm that provides description of the deep learning model.


A computer-readable storage medium according to an embodiment for achieving another object of the present disclosure described above has recorded thereon a computer program for performing the code sequence based intelligent key code identification method.


A code sequence based intelligent key code identification device according to an embodiment for achieving still another object of the present disclosure described above includes a sequence extraction unit to extract Smali code sequence by decompiling an application, a vectorization unit to vectorize the extracted Smali code sequence to construct a training dataset, a learning unit to train a deep learning model with the vectorized Smali code sequence to generate a classifier, a classification unit to generate a category classification result using Smali code sequence of a target application as input of the classifier, and an identification unit to identify and provide important Smali code sequence from which the classification result of the target application is derived.


In an embodiment of the present disclosure, the vectorization unit may include a dataset generation unit to construct the training dataset using all the extracted Smali code sequences, and an embedding unit to vectorize the training dataset to use as input of the deep learning model.


In an embodiment of the present disclosure, the sequence extraction unit may include a Smali code unit to extract Smali code by decompiling the application for each category, and a Smali sequence conversion unit to convert the Smali code to Smali code sequence.


In an embodiment of the present disclosure, the classification unit may classify as a category having a highest probability among categories that will be classified for the target application.


In an embodiment of the present disclosure, the identification unit may use Local Interpretable Model-Agnostic Explanation (LIME) which is an algorithm that provides description of the deep learning model.


Advantageous Effects

According to the code sequence based intelligent key code identification method, since Smali code based on source code of an application being actually run is extracted and converted to Smali code sequence, the actual execution flow of the application is identified, and important Smali code sequence of the application is identified along with classifying the category of the application using a deep learning model. Accordingly, security is strengthened by identifying important Smali code sequence based on the actual execution, so it is expected to prevent damage caused by malicious behavior.





DESCRIPTION OF DRAWINGS


FIG. 1 is a block diagram of a code sequence based intelligent key code identification device according to an embodiment of the present disclosure.



FIG. 2 is a diagram illustrating a Smali code sequence extraction process of a sequence extraction unit of FIG. 1.



FIG. 3 is a block diagram of a vectorization unit of FIG. 1.



FIG. 4 is a flowchart of a code sequence based intelligent key code identification method according to an embodiment of the present disclosure.





BEST MODE

The following detailed description of the present disclosure is made with reference to the accompanying drawings, in which particular embodiments for practicing the present disclosure are shown for illustration purposes. These embodiments are described in sufficient detail for those skilled in the art to practice the present disclosure. It should be understood that various embodiments of the present disclosure are different but do not need to be mutually exclusive. For example, particular shapes, structures and features described herein in connection with one embodiment may be embodied in other embodiments without departing from the spirit and scope of the present disclosure. It should be further understood that changes may be made to the positions or placement of individual elements in each disclosed embodiment without departing from the spirit and scope of the present disclosure. Accordingly, the following detailed description is not intended to be taken in a limiting sense, and the scope of the present disclosure, if appropriately described, is only defined by the appended claims along with the full scope of equivalents to which such claims are entitled. In the drawings, similar reference signs denote the same or similar functions in many aspects.


Hereinafter, the preferred embodiments of the present disclosure will be described in more detail with reference to the accompanying drawings.



FIG. 1 is a block diagram of a code sequence based intelligent key code identification device according to an embodiment of the present disclosure.


The code sequence based intelligent key code identification device 10 according to the present disclosure (hereinafter, the device) evaluates an application by identifying important Smali code sequence of the application based on Smali code of the application. Since Smali code is based on source code of the application being actually run, when Smali code sequence converted from Smali code is used, the actual execution flow of the application is reflected, which makes it possible to objectively evaluate the application.


Referring to FIG. 1, the device 10 according to the present disclosure includes a sequence extraction unit 100, a vectorization unit 200, a learning unit 300, a classification unit 400 and an identification unit 500.
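As a non-limiting illustration, the relationship among the units of FIG. 1 may be sketched as follows. The class name, field names, and callable signatures are hypothetical and are not taken from the present disclosure; the classifier passed as `classify` is assumed to have been produced beforehand by the learning unit 300.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class KeyCodeIdentificationDevice:
    """Hypothetical wiring of the units of FIG. 1 (names are illustrative only)."""

    extract_sequence: Callable[[str], List[str]]      # sequence extraction unit 100
    vectorize: Callable[[List[str]], List[int]]       # vectorization unit 200
    classify: Callable[[List[int]], str]              # classification unit 400 (trained by unit 300)
    identify: Callable[[List[str], str], List[str]]   # identification unit 500

    def evaluate(self, apk_path: str) -> Tuple[str, List[str]]:
        # Decompile the target application and obtain its Smali code sequence.
        sequence = self.extract_sequence(apk_path)
        # Vectorize the sequence and classify it into a category.
        category = self.classify(self.vectorize(sequence))
        # Identify the code sequences from which the classification was derived.
        return category, self.identify(sequence, category)
```

Each unit is passed in as a plain callable so that the units may be implemented as one integrated module or as separate modules, as described for the device 10.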


The device 10 of the present disclosure may run software (application) for performing code sequence based intelligent key code identification thereon, and the sequence extraction unit 100, the vectorization unit 200, the learning unit 300, the classification unit 400 and the identification unit 500 may be controlled by the software for performing the code sequence based intelligent key code identification running on the device 10.


The device 10 may be a separate terminal or modules of the terminal. Additionally, the sequence extraction unit 100, the vectorization unit 200, the learning unit 300, the classification unit 400 and the identification unit 500 may be formed as an integrated module or at least one module. However, to the contrary, each element may be formed as a separate module.


The device 10 may be mobile or fixed. The device 10 may be in the form of a server or an engine, and may be interchangeably used with a device, an apparatus, a terminal, user equipment (UE), a mobile station (MS), a wireless device and a handheld device.


The device 10 may execute or create a variety of software based on an Operating System (OS), namely, a system. The OS is a system program for enabling software to use the hardware of the device, and may include mobile computer OS including Android OS, iOS, Windows Mobile OS, Bada OS, Symbian OS and Blackberry OS and computer OS including Windows family, Linux family, Unix family, MAC, AIX, and HP-UX.


The sequence extraction unit 100 extracts Smali code sequence by decompiling the application. The sequence extraction unit 100 extracts Smali (An Assembler/Disassembler for Android's dex format) code by decompiling the application for each category of the application. The Smali code contains the details and functions of the application. Subsequently, the Smali code is converted to Smali code sequence to identify the sequence of execution.


Referring to FIG. 2, a detailed process of extracting Smali code is as follows. When an APK is decompiled, the Android package contains Signature, AndroidManifest.xml, Resources, and classes.dex. Among them, the classes.dex file may be extracted. The extracted DEX file has 9 fields: header, string_ids, type_ids, proto_ids, field_ids, method_ids, class_def, data, and link_data.


Here, each class_def_item in the class_def field holds class information, and the class_data_off field in class_def_item points to class_data_item. The class_data_item holds method information and is composed of encoded_method entries, each of which locates a code_item through its code_off field, and the actual bytecode is present in the insns field of the code_item.
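As a non-limiting illustration of how the fields described above are located, the following sketch reads the fixed 0x70-byte DEX header and returns the size and offset of each section. The function name and field tuple are hypothetical. The field names follow the published Dalvik executable format, in which the class definition section is named class_defs, and the sketch assumes the standard little-endian layout.

```python
import struct

DEX_HEADER_SIZE = 0x70  # fixed header length in the Dalvik executable format

# 32-bit header fields that follow magic (8 bytes), checksum (4) and signature (20).
_HEADER_FIELDS = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off", "map_off",
    "string_ids_size", "string_ids_off", "type_ids_size", "type_ids_off",
    "proto_ids_size", "proto_ids_off", "field_ids_size", "field_ids_off",
    "method_ids_size", "method_ids_off", "class_defs_size", "class_defs_off",
    "data_size", "data_off",
)


def parse_dex_header(data: bytes) -> dict:
    """Return the section sizes and offsets stored in a DEX header."""
    if len(data) < DEX_HEADER_SIZE or data[:4] != b"dex\n":
        raise ValueError("not a DEX file")
    # Twenty little-endian unsigned 32-bit integers starting at byte offset 32.
    values = struct.unpack_from("<20I", data, 32)
    return dict(zip(_HEADER_FIELDS, values))
```

With `class_defs_off` and `class_defs_size` known, a parser can walk the class_def_item entries and follow class_data_off, code_off, and insns as described above.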


The extracted bytecode is converted to Smali code so that it is human-readable. The code sequence is then extracted using the converted Smali code.
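As a non-limiting illustration, the conversion from Smali instructions to a code sequence may be sketched as follows. The exact normalization rules are not specified in the present disclosure; this sketch assumes that each opcode is reduced to its family followed by an underscore, that register operands are dropped, and that the remaining operands (class, field, and method references and literals) are kept. The function names are hypothetical.

```python
import re

# Register operands such as v0 or p1 (assumed to be dropped from the sequence).
REGISTER = re.compile(r"^[vp]\d+$")


def to_sequence_token(instruction: str) -> str:
    """Normalize one Smali instruction, e.g. 'return-void' -> 'return_'."""
    opcode, _, rest = instruction.strip().partition(" ")
    # Keep only the opcode family: 'invoke-direct' -> 'invoke', 'const/4' -> 'const'.
    family = re.split(r"[-/]", opcode)[0]
    # Naive operand split; string literals containing commas are not handled.
    operands = [part.strip() for part in rest.replace("{", "").replace("}", "").split(",")]
    kept = [op for op in operands if op and not REGISTER.match(op)]
    return " ".join([family + "_"] + kept)


def to_code_sequence(instructions) -> str:
    """Join the normalized instructions of one method into a code sequence."""
    return " ".join(to_sequence_token(line) for line in instructions)
```

For example, under these assumptions the instructions `invoke-direct v0, Ljava/lang/Object;-><init>()V` and `return-void` become the sequence `invoke_ Ljava/lang/Object;-><init>()V return_`.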


As an embodiment, Smali code sequence may be extracted using Androguard, an APK analysis tool. With an APK as input to Androguard, the classes.dex file is extracted, each method is extracted to return its bytecode, and the bytecode is converted to Smali code. Examples of the extracted Smali code and Smali code sequence are shown in the following Tables 1 and 2.









TABLE 1

Smali Code

invoke_ Ljava/lang/Object;-><init>( )V return_ move_ invoke
Landroid/arch/core/executor/ArchTaskExecutor;->postToMainThread(Ljava/lang/Runnable;)V
return_ move_ invoke_
Landroid/arch/core/executor/ArchTaskExecutor;->executeOnDiskIO(Ljava/lang/Runnable;)V
return_ const_ “9” new_ “[I” fill_ sput_ Landroid/support/a/a/a;->a[I const_
“8” new_ “[I” fill_ sput_ Landroid/support/a/a/a;->b[I const_ “13” new_ “I” fill_ sput_
Landroid/support/a/a/a;->c[I const_ “2” new_ “[I” fill_ sput_
Landroid/support/a/a/a;->d[I const_ “1” new_ “[I” const_ “0” constv3 “16843161”
aputv3 sput_ Landroid/support/a/a/a;->e[I new_ “[I” fill_ sput_
Landroid/support/a/a/a;->f[I return_ fill_ (x03, x00, x01, x01, x21, x01, x01, x01,
x55, x01, x01, x01, x59, x01, x01, x01, x1f, x03, x01, x01, xea, x03, x01, x01, xfb,
x03, x01, x01, x02, x04, x01, x01, x03, x04, x01, x01) fill_ (x03, x00, x01, x01, xb5,
x01, x01, x01, xb6, x01, x01, x01, x24, x03, x01, x01, x25, x03, x01, x01, x26, x03,
x01, x01, x5a, x04, x01, x01, x5b, x04, x01, x01) fill_ (x03, x00, x01, x01, x04, x04,
x01, x01, x05, x04, x01, x01, x06, x04, x01, x01, x07, x04, x01, x01, x08, x04, x01,
x01, x09, x04, x01, x01, x0a, x04, x01, x01, x0b, x04, x01, x01, x0c, x04, x01, x01,
x0d, x04, x01, x01, xcb, x04, x01, x01, xcc, x04, x01, x01) fill_ (x03, x00, x01, x01,
x05, x04, x01, x01) fill_ (x03, x00, x01, x01, xcd, x01, x01, x01) iput_
Landroid/support/a/a/b$1;->aLandroid/support/a/a/b; invoke_
Ljava/lang/Object;-><init>( )V return_ iget_
Landroid/support/a/a/b$1;->aLandroid/support/a/a/b; invoke_
Landroid/support/a/a/b;->invalidateSelf( )V return_
...
















TABLE 2

Smali Code Sequence

‘igetv0, v1, Lafu/org/checkerframework/checker/formatter/FormatUtil$ExcessiveOrMissingFormatArgumentException;->expectedI’, ‘returnv0’
‘invoke-directv0, Ljava/lang/Object;-><init>( )V’, ‘return-void’
‘invoke-directv0, Ljava/lang/Exception;-><init>( )V’, ‘iput-objectv1, v0, Lafu/org/checkerframework/checker/regex/RegexUtil$CheckedPatternSyntaxException;->pseLjava/util/regex/PatternSyntaxException;’, ‘return-void’
‘invoke-staticv2, Ljava/util/regex/Pattern;->compile(Ljava/lang/String;)Ljava/util/regex/Pattern;’, ‘move-result-objectv0’, ‘invoke-staticv0, Lafu/org/checkerframework/checker/regex/RegexUtil;->getGroupCount(Ljava/util/regex/Pattern;)I’, ‘move-resultv0’, ‘if-gev0, v3, +00dh’, ‘new-instancev1, Ljava/util/regex/PatternSyntaxException;’, ‘invoke-staticv2, v3, v0, Lafu/org/checkerframework/checker/regex/RegexUtil;->regexErrorMessage(Ljava/lang/String;II)Ljava/lang/String;’,‘move-result-objectv3’,‘const/4v0, −1’, ‘invoke-directv1, v3, v2, v0, Ljava/util/regex/PatternSyntaxException;-><init>(Ljava/lang/String;Ljava/lang/String;I)V’, ‘return-objectv1’, ‘const/4v2, 0’, ‘return-objectv2’, ‘move-exceptionv2’, ‘return-objectv2’
‘const/4v0, 0’, ‘invoke-staticv1, v0, Lafu/plume/RegexUtil;->isRegex(Ljava/lang/String;I)Z’, ‘move-resultv1’, ‘returnv1’
‘const-stringv0, “%(\\d+\\$)?([-#+0, (\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])”’, ‘invoke-staticv0, Ljava/util/regex/Pattern;->compile(Ljava/lang/String;)Ljava/util/regex/Pattern;’, ‘move-result-objectv0’, ‘sput-objectv0, Lafu/org/checkerframework/checker/formatter/FormatUtil;->fsPatternLjava/util/regex/Pattern;’, ‘return-void’]
[‘invoke-directv0, Ljava/lang/Object;-><init>( )V’, ‘return-void’









The vectorization unit 200 vectorizes the extracted Smali code sequence to construct a training dataset. Referring to FIG. 3, the vectorization unit 200 may include a dataset generation unit 210 and an embedding unit 230.


The vectorization unit 200 constructs the training dataset by pre-processing the opcodes, parameters, strings, and memory addresses of all the extracted Smali code sequences. The training dataset is generated by labeling each extracted Smali code sequence with the category of the application from which it was extracted.


As an embodiment, the training dataset may be built from a total of 300 applications, with 50 applications for each of 6 categories (music_and_audio, education, game, beauty, tools, weather).
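As a non-limiting illustration, the labeling step may be sketched as follows. The function name and the integer label assignment are hypothetical; only the six category names are taken from the embodiment above.

```python
# Category names from the embodiment; the integer labels are an assumption.
CATEGORIES = ["music_and_audio", "education", "game", "beauty", "tools", "weather"]


def build_training_dataset(sequences_by_category):
    """Pair every extracted Smali code sequence with its category label."""
    label_of = {name: index for index, name in enumerate(CATEGORIES)}
    dataset = []
    for category, sequences in sequences_by_category.items():
        for sequence in sequences:
            dataset.append((sequence, label_of[category]))
    return dataset
```

The input is assumed to be a mapping from a category name to the Smali code sequences extracted from the applications of that category.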


The embedding unit 230 vectorizes the training dataset for use as input of a deep learning model. Dictionarization is performed by converting the generated training dataset into numeric form. Subsequently, embedding is performed by converting the dictionarized Smali code sequences into dense vectors.
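As a non-limiting illustration, dictionarization and embedding may be sketched as follows. The function names are hypothetical, and the embedding table here is merely randomly initialized; in a deep learning framework the table would typically be a trainable layer whose values are learned together with the classifier.

```python
import random


def build_dictionary(sequences):
    """Assign a consecutive integer number to each distinct Smali code sequence."""
    dictionary = {}
    for sequence in sequences:
        if sequence not in dictionary:
            dictionary[sequence] = len(dictionary)
    return dictionary


def make_embedding_table(vocab_size, dim, seed=0):
    """Create one dense vector per dictionary entry (random initial values)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(vocab_size)]


def embed(sequence_ids, table):
    """Look up the dense vector of each dictionary number."""
    return [table[index] for index in sequence_ids]
```

The dictionary plays the role of the numbered list of Smali code sequences, and `embed` converts a list of dictionary numbers into the dense vectors used as input of the deep learning model.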


As an embodiment, 4,386,662 Smali code sequences may be incorporated into a dictionary and vectorized. An example of the resulting dictionary is as shown in the following Table 3.










TABLE 3

No.       Smali code Sequence
0         invoke_ Ljava/lang/Object;-><init>( )V ...
1         move_ invoke_ Landroid/arch/core/executor ...
2         const_ “9” new_ “[I” fill_ sput_ Landroid/support ...
3         iput_ Landroid/support/a/a/b$1;->aLandroid ...
4         new_ Ljava/lang/ref/WeakReference; invoke_ ...
5         igetv_ Lafu/org/checkerframework/checker ...
...       ...
4386656   iget_ Lcom/bumptech/glide/reques ...
4386657   check_ Landroid/graphics/drawable/Draw ...
4386658   invoke_ Lcom/google/android/gms/ ...
4386659   check_ Lcom/google/android/gms/common/...
4386660   invoke_ Landroidx/appcompat/app/AppC ...
4386661   new_ Lcom/google/android/material/ ...









The learning unit 300 trains the deep learning model with the vectorized Smali code sequences to generate a classifier. In other words, the learning unit 300 inputs the vectorized training dataset to the deep learning model to generate a classifier, and trains a prediction model using a CNN deep learning algorithm.


As an embodiment, the prediction model uses 4 convolution layers, with ReLU as the activation function. Max pooling is applied so that only the features having large values are used. The prediction model is trained with a total of 140,613,114 parameters, the number of training weights being reduced by using a Gated Recurrent Unit (GRU). The validation accuracy of the trained prediction model is measured as 0.8361.
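As a non-limiting illustration of the convolution, ReLU, and max pooling operations named above, a single one-dimensional filter over an embedded sequence may be sketched as follows. This is a simplified sketch with hypothetical function names; it does not reproduce the 4-layer architecture, the GRU, or the parameter count of the embodiment.

```python
def conv1d(embedded, kernel, bias=0.0):
    """Slide one filter over a sequence of embedding vectors (valid positions only)."""
    width = len(kernel)
    outputs = []
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            # Dot product of the filter row with the embedding at this position.
            total += sum(x * w for x, w in zip(embedded[start + offset], kernel[offset]))
        outputs.append(total)
    return outputs


def relu(values):
    """Replace negative activations with zero."""
    return [max(0.0, value) for value in values]


def global_max_pool(values):
    """Keep only the strongest activation of the filter."""
    return max(values)
```

Applying `global_max_pool(relu(conv1d(...)))` yields one feature per filter, which is why only the feature having the largest value along the sequence is carried forward.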


The classification unit 400 generates a category classification result using Smali code sequence of a target application as input of the classifier. The classification unit 400 may classify as a category having the highest probability among categories that will be classified for the target application.
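As a non-limiting illustration, selecting the category having the highest probability may be sketched as follows. The function names are hypothetical, and the classifier is assumed to output one raw score per category that is converted to probabilities by a softmax function.

```python
import math


def softmax(scores):
    """Convert raw classifier scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify(scores, categories):
    """Return the category with the highest probability and that probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return categories[best], probabilities[best]
```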


After training the prediction model, Smali code sequence having the greatest influence in each category may be extracted using Local Interpretable Model-agnostic Explanations (LIME) which is a deep learning visualization technique in the model. An equation for calculating LIME is given as the following Equation 1.













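$$
\begin{aligned}
\xi(x) &= \underset{g \in G}{\operatorname{argmin}}\; \mathcal{L}(f, g, \pi_x) + \Omega(g) \\
&= \underset{g \in G}{\operatorname{argmin}} \left[ \sum_{i=1}^{N} e^{-\frac{D(x, x_i)^2}{\sigma^2}} \left( f(z_i) - g(z'_i) \right)^2 + \infty\, \mathbb{I}\left[ \lVert \beta \rVert_0 > K \right] \right]
\end{aligned}
\qquad \text{[Equation 1]}
$$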







Here, f is the complex prediction model, and g is a simple model used for local comparison. x is the data, and β is the coefficient vector of the model g, whose zero norm is defined as $\lVert \beta \rVert_0 = \sum_j |\beta_j|^0$.


For example, when Smali code is extracted from a target APK, converted to the same form as the training data, and used as input of the prediction model, and the prediction model classifies the target APK as music_and_audio, LIME is applied to the prediction model to identify the important Smali code from which the classification of the target APK as music_and_audio is derived.
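As a non-limiting illustration of the idea behind Equation 1, a LIME-style explanation of one code sequence may be sketched as follows. This is a simplified sketch under stated assumptions and is not the LIME library: perturbed samples are produced by randomly removing tokens, the distance D is taken as the fraction of removed tokens, the simple model g is a linear model fitted by weighted least squares with a small ridge term, and the constraint on the number of non-zero coefficients is approximated by keeping the `top_k` largest coefficients after fitting. All function and parameter names are hypothetical.

```python
import math
import random


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def explain_sequence(tokens, predict_fn, top_k=2, num_samples=200,
                     sigma=0.75, ridge=1e-3, seed=0):
    """Rank tokens by the coefficient of a locally fitted linear model g."""
    rng = random.Random(seed)
    n = len(tokens)
    dim = n + 1  # one coefficient per token plus an intercept
    gram = [[ridge if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    moment = [0.0] * dim
    for sample in range(num_samples):
        # The first sample is the unperturbed input; the rest drop random tokens.
        mask = [1] * n if sample == 0 else [rng.randint(0, 1) for _ in range(n)]
        kept = [token for token, keep in zip(tokens, mask) if keep]
        distance = 1.0 - sum(mask) / n
        weight = math.exp(-(distance ** 2) / (sigma ** 2))  # proximity kernel
        row = [float(m) for m in mask] + [1.0]
        target = predict_fn(kept)  # f evaluated on the perturbed sequence
        for i in range(dim):
            moment[i] += weight * row[i] * target
            for j in range(dim):
                gram[i][j] += weight * row[i] * row[j]
    beta = solve_linear_system(gram, moment)
    ranked = sorted(zip(tokens, beta[:n]), key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:top_k]
```

Here `predict_fn` is assumed to return the probability that the prediction model assigns to the predicted category for a given list of tokens; the tokens with the largest coefficients are those that contribute most to that category.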


As an embodiment, important Smali code sequences for each category are as shown in the following Tables 4 to 6.










TABLE 4

music_and_audio

sget Landroid/os/Build$VERSION;->SDK_INTI const_ “21” if_ invoke Landroid/widget/ImageView;->setImageMatrix(Landroid/graphics/Matrix;)V goto_ sget Landroidx/transition/ImageViewUtils;->sAnimateTransformMethodLjava/lang/reflect/Method; if_ const_ “1” new [Ljava/lang/Object; const_ “0” aput invoke Ljava/lang/reflect/Method;->invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; goto_ move_ new Ljava/lang/RuntimeException; invoke Ljava/lang/reflect/InvocationTargetException;->getCause( )Ljava/lang/Throwable; move_ invoke Ljava/lang/RuntimeException;-><init>(Ljava/lang/Throwable;)V throw_ return

iget Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; if_ sget Landroidx/renderscript/Element$DataType;->FLOAT_64Landroidx/renderscript/Element$DataType; const_ “3” invoke Landroidx/renderscript/Element;->createVector(Landroidx/renderscript/RenderScript;Landroidx/renderscript/Element$DataType;I)Landroidx/renderscript/Element; move_ iput Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; iget Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; return

invoke_ Ljava/lang/Object;-><init>( )V new_ Ljava/lang/ThreadLocal; invoke Ljava/lang/ThreadLocal;-><init>( )V iput Lbolts/BoltsExecutors$ImmediateExecutor;->executionDepthLjava/lang/ThreadLocal; return

iget_ Lokhttp3/Cookie;->secureZ return

invoke Landroidx/appcompat/app/AppCompatDelegateImpl;->ensureSubDecor( )V iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mSubDecorLandroid/view/ViewGroup; constv1 “16908290” invoke Landroid/view/ViewGroup;->findViewById(I)Landroid/view/View; move_ check Landroid/view/ViewGroup; invoke Landroid/view/ViewGroup;->removeAllViews( )V iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mContextLandroid/content/Context; invoke Landroid/view/LayoutInflater;->from(Landroid/content/Context;)Landroid/view/LayoutInflater; move_ invoke Landroid/view/LayoutInflater;->inflate(ILandroid/view/ViewGroup;)Landroid/view/View; iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mAppCompatWindowCallbackLandroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback; invoke Landroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback;->getWrapped( )Landroid/view/Window$Callback; move_ invoke Landroid/view/Window$Callback;->onContentChanged( )V return

game

invoke Lkotlin/random/Random$Default;-><init>( )V return

sget Landroid/os/Build$VERSION;->SDK_INTI const_ “21” if_ invoke Landroid/widget/ImageView;->setImageMatrix(Landroid/graphics/Matrix;)V goto_ sget Landroidx/transition/ImageViewUtils;->sAnimateTransformMethodLjava/lang/reflect/Method; if_ const_ “1” new [Ljava/lang/Object; const_ “0” aput invoke Ljava/lang/reflect/Method;->invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; goto_ move_ new Ljava/lang/RuntimeException; invoke Ljava/lang/reflect/InvocationTargetException;->getCause( )Ljava/lang/Throwable; move_ invoke Ljava/lang/RuntimeException;-><init>(Ljava/lang/Throwable;)V throw_ return

iget Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; if_ sget Landroidx/renderscript/Element$DataType;->FLOAT_64Landroidx/renderscript/Element$DataType; const_ “3” invoke Landroidx/renderscript/Element;->createVector(Landroidx/renderscript/RenderScript;Landroidx/renderscript/Element$DataType;I)Landroidx/renderscript/Element; move_ iput Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; iget Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; return

iget Landroidx/documentfile/provider/SingleDocumentFile;->mContextLandroid/content/Context; iget Landroidx/documentfile/provider/SingleDocumentFile;->mUriLandroid/net/Uri; invoke Landroidx/documentfile/provider/DocumentsContractApi19;->isDirectory(Landroid/content/Context;Landroid/net/Uri;)Z move_ return

invoke Landroidx/appcompat/app/AppCompatDelegateImpl;->ensureSubDecor( )V iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mSubDecorLandroid/view/ViewGroup; constv1 “16908290” invoke Landroid/view/ViewGroup;->findViewById(I)Landroid/view/View; move_ check Landroid/view/ViewGroup; invoke Landroid/view/ViewGroup;->removeAllViews( )V iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mContextLandroid/content/Context; invoke Landroid/view/LayoutInflater;->from(Landroid/content/Context;)Landroid/view/LayoutInflater; move_ invoke Landroid/view/LayoutInflater;->inflate(ILandroid/view/ViewGroup;)Landroid/view/View; iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mAppCompatWindowCallbackLandroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback; invoke Landroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback;->getWrapped( )Landroid/view/Window$Callback; move_ invoke Landroid/view/Window$Callback;->onContentChanged( )V return

iget Landroidx/transition/Visibility$DisappearListener;->mSuppressLayoutZ if_ iget Landroidx/transition/Visibility$DisappearListener;->mLayoutSuppressedZ if_ iget Landroidx/transition/Visibility$DisappearListener;->mParentLandroid/view/ViewGroup; if_ iput Landroidx/transition/Visibility$DisappearListener;->mLayoutSuppressedZ invoke Landroidx/transition/ViewGroupUtils;->suppressLayout(Landroid/view/ViewGroup;Z)V return

















TABLE 5

education

invoke Landroidx/loader/content/ModernAsyncTask;->isCancelled( )Z move_ if_ invoke Landroidx/loader/content/ModernAsyncTask;->onCancelled(Ljava/lang/Object;)V goto_ invoke Landroidx/loader/content/ModernAsyncTask;->onPostExecute(Ljava/lang/Object;)V sget Landroidx/loader/content/ModernAsyncTask$Status;->FINISHEDLandroidx/loader/content/ModernAsyncTask$Status; iput Landroidx/loader/content/ModernAsyncTask;->mStatusLandroidx/loader/content/ModernAsyncTask$Status; return

iput Lcom/airbnb/lottie/LottieDrawable;->performanceTrackingEnabledZ iget Lcom/airbnb/lottie/LottieDrawable;->compositionLcom/airbnb/lottie/LottieComposition; if_ invoke Lcom/airbnb/lottie/LottieComposition;->setPerformanceTrackingEnabled(Z)V return

invoke_ Ljava/lang/Object;-><init>( )V new_ Ljava/lang/ThreadLocal; invoke Ljava/lang/ThreadLocal;-><init>( )V iput Lbolts/BoltsExecutors$ImmediateExecutor;->executionDepthLjava/lang/ThreadLocal; return

iget_ Lokhttp3/Cookie;->secureZ return

const_ “0” move_ move_ move_ move move_ invoke_ invoke Lio/reactivex/Flowable;->window(JLjava/util/concurrent/TimeUnit;Lio/reactivex/Scheduler;JZ)Lio/reactivex/Flowable; move_ return

new Lio/reactivex/internal/operators/observable/ObservableFromPublisher; invoke Lio/reactivex/internal/operators/observable/ObservableFromPublisher;-><init>(Lorg/reactivestreams/Publisher;)V invoke Lio/reactivex/plugins/RxJavaPlugins;->onAssembly(Lio/reactivex/Observable;)Lio/reactivex/Observable; move_ return

beauty

invoke_ Ljava/lang/Object;-><init>( )V new_ Ljava/lang/ThreadLocal; invoke Ljava/lang/ThreadLocal;-><init>( )V iput Lbolts/BoltsExecutors$ImmediateExecutor;->executionDepthLjava/lang/ThreadLocal; return

iput Lcom/airbnb/lottie/LottieDrawable;->performanceTrackingEnabledZ iget Lcom/airbnb/lottie/LottieDrawable;->compositionLcom/airbnb/lottie/LottieComposition; if_ invoke Lcom/airbnb/lottie/LottieComposition;->setPerformanceTrackingEnabled(Z)V return

iget_ Lokhttp3/Cookie;->secureZ return

invoke Landroidx/appcompat/app/AppCompatDelegateImpl;->ensureSubDecor( )V iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mSubDecorLandroid/view/ViewGroup; constv1 “16908290” invoke Landroid/view/ViewGroup;->findViewById(I)Landroid/view/View; move_ check Landroid/view/ViewGroup; invoke Landroid/view/ViewGroup;->removeAllViews( )V iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mContextLandroid/content/Context; invoke Landroid/view/LayoutInflater;->from(Landroid/content/Context;)Landroid/view/LayoutInflater; move_ invoke Landroid/view/LayoutInflater;->inflate(ILandroid/view/ViewGroup;)Landroid/view/View; iget Landroidx/appcompat/app/AppCompatDelegateImpl;->mAppCompatWindowCallbackLandroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback; invoke Landroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback;->getWrapped( )Landroid/view/Window$Callback; move_ invoke Landroid/view/Window$Callback;->onContentChanged( )V return

iget Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; if_ sget Landroidx/renderscript/Element$DataType;->FLOAT_64Landroidx/renderscript/Element$DataType; const_ “3” invoke Landroidx/renderscript/Element;->createVector(Landroidx/renderscript/RenderScript;Landroidx/renderscript/Element$DataType;I)Landroidx/renderscript/Element; move_ iput Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; iget Landroidx/renderscript/RenderScript;->mElement_DOUBLE_3Landroidx/renderscript/Element; return

















TABLE 6





weather
tools







invoke
(‘sgetv0


Landroidx/loader/content/ModernAsyncT
Landroid/os/Build$VERSION;−>SDK_IN


ask;−>isCancelled( )Z move_ if_ invoke
TI const_ “21” if_ invoke


Landroidx/loader/content/ModernAsyncT
Landroid/widget/ImageView;−>setImage


ask;−>onCancelled(Ljava/lang/Object;)V
Matrix(Landroid/graphics/Matrix;)V


goto_ invoke
goto_ sget


Landroidx/loader/content/ModernAsyncT
Landroidx/transition/ImageViewUtils;−>s


ask;−>onPostExecute(Ljava/lang/Object;)
AnimateTransformMethodLjava/lang/refle


V sget
ct/Method; if_ const_ “1” new


Landroidx/loader/content/ModernAsyncT
[Ljava/lang/Object; const_ “0” aput


ask$Status;−>FINISHEDLandroidx/loader
invoke


/content/ModernAsyncTask$Status; iput
Ljava/lang/reflect/Method;−>invoke(Ljava


Landroidx/loader/content/ModernAsyncT
/lang/Object;[Ljava/lang/Object;)Ljava/lan


ask;−>mStatusLandroidx/loader/content/M
g/Object; goto_ move_ new


odernAsyncTask$Status; return
Ljava/lang/RuntimeException; invoke



Ljava/lang/reflect/InvocationTargetExcept



ion;−>getCause( )Ljava/lang/Throwable;



move_ invoke



Ljava/lang/RuntimeException;−><init>(Lj



ava/lang/Throwable;)V throw_ return


sgetv0
iget


Landroid/os/Build$VERSION;−>SDK_IN
Landroidx/renderscript/RenderScript;−>m


TI const_ “21” if_ invoke
Element_DOUBLE_3Landroidx/renderscr


Landroid/widget/ImageView;−>setImage
ipt/Element; if_ sget


Matrix(Landroid/graphics/Matrix;)V
Landroidx/renderscript/Element$DataTyp


goto_ sget
e;−>FLOAT_64Landroidx/renderscript/Ele


Landroidx/transition/ImageViewUtils;−>s
ment$DataType; const_ “3” invoke


AnimateTransformMethodLjava/lang/refl
Landroidx/renderscript/Element;−>createV


ect/Method; if_ const_ “1” new
ector(Landroidx/renderscript/RenderScript


[Ljava/lang/Object; const_ “0” aput
;Landroidx/renderscript/Element$DataTyp


invoke
e;I)Landroidx/renderscript/Element;


Ljava/lang/reflect/Method;−>invoke(Ljava
move_ iput


/lang/Object;[Ljava/lang/Object;)Ljava/la
Landroidx/renderscript/RenderScript;−>m


ng/Object; goto_ move_ new
Element_DOUBLE_3Landroidx/renderscr


Ljava/lang/RuntimeException; invoke
ipt/Element; iget


Ljava/lang/reflect/InvocationTargetExcept
Landroidx/renderscript/RenderScript;−>m


ion;−>getCause( )Ljava/lang/Throwable;
Element_DOUBLE_3Landroidx/renderscr


move_ invoke
ipt/Element; return


Ljava/lang/RuntimeException;−><init>(Lj


ava/lang/Throwable;)V throw_ return


iget
iget


Landroidx/renderscript/RenderScript;−>m
Landroidx/documentfile/provider/SingleD


Element_DOUBLE_3Landroidx/renderscr
ocumentFile;−>mContextLandroid/content


ipt/Element; if_ sget
/Context; iget


Landroidx/renderscript/Element$DataTyp
Landroidx/documentfile/provider/SingleD


e;−>FLOAT_64Landroidx/renderscript/El
ocumentFile;−>mUriLandroid/net/Uri;


ement$DataType; const_ “3” invoke
invoke


Landroidx/renderscript/Element;−>createV
Landroidx/documentfile/provider/Docume


ector(Landroidx/renderscript/RenderScript
ntsContractApi19;−>isDirectory(Landroid/


;Landroidx/renderscript/Element$DataTyp
content/Context;Landroid/net/Uri;)Z


e;I)Landroidx/renderscript/Element;
move_ return


move_ iput


Landroidx/renderscript/RenderScript;−>m


Element_DOUBLE_3Landroidx/renderscr


ipt/Element; iget


Landroidx/renderscript/RenderScript;−>m


Element_DOUBLE_3Landroidx/renderscr


ipt/Element; return


iget_ Lokhttp3/Cookie;−>secureZ return
invoke



Landroidx/appcompat/app/AppCompatDel



egateImpl;−>ensureSubDecor( )V iget



Landroidx/appcompat/app/AppCompatDel



egateImpl;−>mSubDecorLandroid/view/Vi



ewGroup; constv1 “16908290” invoke



Landroid/view/ViewGroup;−>findViewBy



Id(I)Landroid/view/View; move_ check



Landroid/view/ViewGroup; invoke



Landroid/view/ViewGroup;−>removeAllV



iews( )V iget



Landroidx/appcompat/app/AppCompatDel



egateImpl;−>mContextLandroid/content/C



ontext; invoke



Landroid/view/LayoutInflater;−>from(Lan



droid/content/Context;)Landroid/view/Lay



outInflater; move_ invoke



Landroid/view/LayoutInflater;−>inflate(IL



android/view/ViewGroup;)Landroid/view/



View; iget



Landroidx/appcompat/app/AppCompatDel



egateImpl;−>mAppCompatWindowCallba



ckLandroidx/appcompat/app/AppCompat



DelegateImpl$AppCompatWindowCallba



ck; invoke



Landroidx/appcompat/app/AppCompatDel



egateImpl$AppCompatWindowCallback;−



>getWrapped( )Landroid/view/Window$C



allback; move_ invoke



Landroid/view/Window$Callback;−>onCo



ntentChanged( )V return


invoke
iget


Landroidx/appcompat/app/AppCompatDe
Landroidx/transition/Visibility$Disappear


legateImpl;−>ensureSubDecor( )V iget
Listener;−>mSuppressLayoutZ if_ iget


Landroidx/appcompat/app/AppCompatDe
Landroidx/transition/Visibility$Disappear


legateImpl;−>mSubDecorLandroid/view/V
Listener;−>mLayoutSuppressedZ if_ iget


iewGroup; constv1 “16908290” invoke
Landroidx/transition/Visibility$Disappear


Landroid/view/ViewGroup;−>findViewBy
Listener;−>mParentLandroid/view/ViewGr


Id(I)Landroid/view/View; move_ check
oup; if_ iput


Landroid/view/ViewGroup; invoke
Landroidx/transition/Visibility$Disappear


Landroid/view/ViewGroup;−>removeAllV
Listener;−>mLayoutSuppressedZ invoke


iews( )V iget
Landroidx/transition/ViewGroupUtils;−>s


Landroidx/appcompat/app/AppCompatDe
uppressLayout(Landroid/view/ViewGroup


legateImpl;−>mContextLandroid/content/
;Z)V return


Context; invoke


Landroid/view/LayoutInflater;−>from(Lan


droid/content/Context;)Landroid/view/Lay


outInflater; move_ invoke


Landroid/view/LayoutInflater;−>inflate(IL


android/view/ViewGroup;)Landroid/view/


View; iget


Landroidx/appcompat/app/AppCompatDe


legateImpl;−>mAppCompatWindowCallb


ackLandroidx/appcompat/app/AppCompat


DelegateImpl$AppCompatWindowCallba


ck; invoke


Landroidx/appcompat/app/AppCompatDe


legateImpl$AppCompatWindowCallback;


−>getWrapped( )Landroid/view/Window$


Callback; move_ invoke


Landroid/view/Window$Callback;−>onCo


ntentChanged( )V return_ > 3380905.00’,


0.0001078366650393867)









The identification unit 500 identifies and provides the important Smali code sequence from which the classification result of the target application is derived.


The identification unit 500 may include an important Smali code sequence identifier that outputs the Smali code sequences having the greatest influence on the result derived by the classification unit 400.


Accordingly, once the target application is classified into a category, the important Smali code sequences of the target application for that category are output.


As an embodiment, when the target application is classified into the music_and_audio category, the important Smali code sequences are output as follows.














 “sget_ Landroid/os/Build$VERSION;->SDK_INTI const_ “21” if_ invoke_ Landroid/widget/ImageView;->setImageMatrix(Landroid/graphics/Matrix;)V goto_ sget_ Landroidx/transition/ImageViewUtils;->sAnimateTransformMethodLjava/lang/reflect/Method; if_ const_ “1” new_ [Ljava/lang/Object; const_ “0” aput_ invoke_ Ljava/lang/reflect/Method;->invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; goto_ move_ new_ Ljava/lang/RuntimeException; invoke_ Ljava/lang/reflect/InvocationTargetException;->getCause()Ljava/lang/Throwable; move_ invoke_ Ljava/lang/RuntimeException;-><init>(Ljava/lang/Throwable;)V throw_ return_”

 ...

 “invoke_ Landroidx/appcompat/app/AppCompatDelegateImpl;->ensureSubDecor()V iget_ Landroidx/appcompat/app/AppCompatDelegateImpl;->mSubDecorLandroid/view/ViewGroup; constv1 “16908290” invoke_ Landroid/view/ViewGroup;->findViewById(I)Landroid/view/View; move_ check_ Landroid/view/ViewGroup; invoke_ Landroid/view/ViewGroup;->removeAllViews()V iget_ Landroidx/appcompat/app/AppCompatDelegateImpl;->mContextLandroid/content/Context; invoke_ Landroid/view/LayoutInflater;->from(Landroid/content/Context;)Landroid/view/LayoutInflater; move_ invoke_ Landroid/view/LayoutInflater;->inflate(ILandroid/view/ViewGroup;)Landroid/view/View; iget_ Landroidx/appcompat/app/AppCompatDelegateImpl;->mAppCompatWindowCallbackLandroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback; invoke_ Landroidx/appcompat/app/AppCompatDelegateImpl$AppCompatWindowCallback;->getWrapped()Landroid/view/Window$Callback; move_ invoke_ Landroid/view/Window$Callback;->onContentChanged()V return_”









The output important Smali code sequences may be used to check whether the APK file was properly classified and to identify the code to be protected from attackers, thereby providing a list of code to which a protection technique is to be applied.



FIG. 4 is a flowchart of a code sequence based intelligent key code identification method according to an embodiment of the present disclosure.


The code sequence based intelligent key code identification method according to this embodiment may be performed in substantially the same configuration as the device 10 of FIG. 1. Accordingly, the same reference signs are given to the same elements as in the device 10 of FIG. 1, and repeated descriptions are omitted herein.


Additionally, the code sequence based intelligent key code identification method according to this embodiment may be performed by software (application) for performing code sequence based intelligent key code identification.


The present disclosure evaluates an application by identifying the important Smali code sequences of the application based on the Smali code of the application. Since Smali code is based on source code of the application being actually run, using the Smali code sequence converted from the Smali code reflects the actual execution flow of the application, which makes it possible to objectively evaluate the application.


Referring to FIG. 4, the code sequence based intelligent key code identification method according to this embodiment includes the step of extracting a Smali code sequence by decompiling an application (S10). In the step of extracting the Smali code sequence, Smali code is extracted by decompiling the applications of each category and is converted to Smali code sequences.
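
As an illustration of step S10, the conversion from decompiled Smali text to a flat code sequence might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function name `smali_method_to_sequence` is hypothetical, the normalization (opcode family followed by an underscore, registers and jump labels dropped, references and literals kept) is inferred only from the sample sequences shown in this document, and the `.smali` lines are assumed to have been produced beforehand by a decompiler. Only ordinary instructions are handled; payload blocks such as array data are not.

```python
import re

# Registers such as v0 or p1 carry no category information in this sketch.
REGISTER = re.compile(r"^[vp]\d+$")
# A double-quoted string literal, allowing escaped characters inside it.
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"')


def smali_method_to_sequence(method_lines):
    """Turn the instruction lines of one Smali method into a token sequence.

    Hypothetical normalization: "iget-boolean" becomes "iget_", "const/4"
    becomes "const_", and so on. Class, method and field references and
    literals are kept as separate tokens.
    """
    tokens = []
    for raw in method_lines:
        line = raw.strip()
        # Skip blanks, comments (#), directives (.locals) and labels (:cond_0).
        if not line or line[0] in "#.:":
            continue
        opcode, _, rest = line.partition(" ")
        tokens.append(re.split(r"[-/]", opcode)[0] + "_")
        # Take out a string literal first so commas inside it are preserved.
        literal = STRING_LITERAL.search(rest)
        if literal:
            rest = rest[:literal.start()] + rest[literal.end():]
        # Drop register lists such as {p0, v1} or {v0 .. v5}.
        rest = re.sub(r"\{[^}]*\}", "", rest)
        for operand in rest.split(","):
            operand = operand.strip()
            if operand and not REGISTER.match(operand) and not operand.startswith(":"):
                tokens.append(operand)
        if literal:
            tokens.append(literal.group(0))
    return " ".join(tokens)
```

For example, the two instructions `iget-boolean v0, p0, Lokhttp3/Cookie;->secure:Z` and `return v0` yield the sequence `iget_ Lokhttp3/Cookie;->secure:Z return_` under this sketch.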


The extracted Smali code sequence is vectorized to construct a training dataset (S20). In the step of constructing the training dataset, the training dataset is built using all the extracted Smali code sequences and is then vectorized so that it can be used as input to a deep learning model.
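
As an illustration of step S20, one common way to vectorize token sequences is to map each token to an integer index and pad every sequence to a fixed length. The sketch below assumes this scheme; the embodiment does not specify its embedding method, and the names `build_vocab`, `vectorize`, `PAD` and `UNK` are hypothetical. Tokens are split on whitespace, so a string literal containing spaces would be split into several tokens.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved indices (an assumption of this sketch)


def build_vocab(sequences, min_count=1):
    """Assign an integer index to every token seen at least min_count times."""
    counts = Counter(tok for seq in sequences for tok in seq.split())
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def vectorize(sequence, vocab, max_len):
    """Map tokens to indices, truncate to max_len, and right-pad with PAD."""
    ids = [vocab.get(tok, UNK) for tok in sequence.split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))
```

The resulting fixed-length integer vectors are the kind of input that an embedding layer of a deep learning model typically accepts.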


The deep learning model is trained with the vectorized Smali code sequences to generate a classifier (S30).


A category classification result is generated using the Smali code sequence of a target application as input of the classifier (S40). The step of generating the category classification result may include classifying the target application into the category having the highest probability among the candidate categories.
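
The highest-probability rule of step S40 can be sketched as follows, assuming the classifier produces one raw score (logit) per category. The function names and the use of a softmax are assumptions of this sketch and are not stated in the disclosure.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, categories):
    """Return the category with the highest probability and that probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]
```

For instance, `classify([0.1, 2.0, -1.0], ["weather", "tools", "music_and_audio"])` selects `"tools"`, since the second score is the largest.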


The important Smali code sequence from which the classification result of the target application is derived is identified and provided (S50). In the step of identifying and providing the important Smali code sequence, Local Interpretable Model-Agnostic Explanations (LIME), an algorithm that explains the predictions of a deep learning model, may be used.
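
The idea behind step S50 can be illustrated with a deliberately simplified stand-in. The sketch below is not LIME: it uses leave-one-out occlusion, scoring each Smali code sequence by how much the probability of the predicted category drops when that sequence is removed. LIME itself instead samples many random perturbations of the input and fits a locally weighted linear surrogate model. The names `sequence_importance` and `predict_proba` are hypothetical; `predict_proba` is assumed to take a list of sequences and return one probability per category.

```python
def sequence_importance(sequences, predict_proba, target_index):
    """Rank sequences by the probability drop caused by removing each one.

    sequences: list of Smali code sequences of the target application.
    predict_proba: callable mapping a list of sequences to a list of
        per-category probabilities (hypothetical classifier interface).
    target_index: index of the category assigned to the application.
    """
    baseline = predict_proba(sequences)[target_index]
    scores = []
    for i, seq in enumerate(sequences):
        # Remove exactly one sequence and measure the change in confidence.
        reduced = sequences[:i] + sequences[i + 1:]
        drop = baseline - predict_proba(reduced)[target_index]
        scores.append((seq, drop))
    # Largest drop first: these sequences influenced the result the most.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

The first entries of the returned list correspond to the sequences with the greatest influence on the classification result. This approach needs one classifier call per sequence, so for applications with many sequences a sampling-based method such as LIME may be more practical.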


The important Smali code sequences having the greatest influence on the classification result are output (S60). Accordingly, the important Smali code sequences of the target application are output for the category to which the target application belongs.


According to the code sequence based intelligent key code identification method, Smali code based on source code of the application being actually run is extracted and converted to Smali code sequences, so that the actual execution flow of the application is captured, and the important Smali code sequences of the application are identified while the category of the application is classified using the deep learning model. Because the important Smali code sequences are identified based on the actual execution, security is strengthened, which is expected to help prevent damage caused by malicious behavior.


The code sequence based intelligent key code identification method may be implemented in the form of applications or program instructions that can be executed through a variety of computer components, and recorded in computer-readable recording media. The computer-readable recording media may include program instructions, data files and data structures, alone or in combination.


The program instructions recorded in the computer-readable recording media may be specially designed and configured for the present disclosure, or may be those known and available to persons having ordinary skill in the field of computer software.


Examples of the computer-readable recording media include hardware devices specially designed to store and execute the program instructions, for example, magnetic media such as hard disk, floppy disk and magnetic tape, optical media such as CD-ROM and DVD, magneto-optical media such as floptical disk, and ROM, RAM, and flash memory.


Examples of the program instructions include machine code generated by a compiler as well as high-level language code that can be executed by a computer using an interpreter. The hardware device may be configured to act as one or more software modules to perform the processing according to the present disclosure, and vice versa.


While the present disclosure has been hereinabove described with reference to the embodiments, those skilled in the art will understand that various modifications and changes may be made thereto without departing from the spirit and scope of the present disclosure defined in the appended claims.


INDUSTRIAL APPLICABILITY

The present disclosure evaluates an application by identifying the important Smali code sequences of the application based on the Smali code sequences of the application. Since Smali code is based on source code of the application being actually run, using the Smali code sequence converted from the Smali code reflects the actual execution flow of the application well, which makes it possible to objectively evaluate the application.


Accordingly, the present disclosure can be used as a mobile application key code detection tool that identifies key code through code analysis of mobile applications, thereby helping to prevent damage caused by malicious behavior.


DETAILED DESCRIPTION OF MAIN ELEMENTS


10: Code sequence based intelligent key code identification device



100: Sequence extraction unit



200: Vectorization unit



300: Learning unit



400: Classification unit



500: Identification unit

Claims
  • 1-11. (canceled)
  • 12. A code sequence based intelligent key code identification method, comprising: extracting Smali code sequences by decompiling a plurality of applications; constructing a training dataset by vectorizing the extracted Smali code sequences; training a deep learning model with the vectorized Smali code sequences to generate a classifier; generating a category classification result using Smali code sequences of a target application as input of the classifier; and identifying and providing an important Smali code sequence from which the category classification result of the target application is derived.
  • 13. The method according to claim 12, wherein the constructing the training dataset comprises: constructing the training dataset using all the extracted Smali code sequences; and vectorizing the training dataset to use as input of the deep learning model.
  • 14. The method of claim 12, wherein the extracting the Smali code sequences comprises: extracting Smali code by decompiling the plurality of applications for each category; and converting the Smali code to the Smali code sequences.
  • 15. The method of claim 12, wherein the generating the category classification result comprises classifying the target application as a category having a highest probability among categories that will be classified for the target application.
  • 16. The method of claim 12, wherein the identifying and providing the important Smali code sequence uses Local Interpretable Model-Agnostic Explanation (LIME) which is an algorithm that provides description of the deep learning model.
  • 17. A non-transitory computer-readable storage medium having recorded thereon a computer program for performing a code sequence based intelligent key code identification method, the method comprising: extracting Smali code sequences by decompiling a plurality of applications; constructing a training dataset by vectorizing the extracted Smali code sequences; training a deep learning model with the vectorized Smali code sequences to generate a classifier; generating a category classification result using Smali code sequences of a target application as input of the classifier; and identifying and providing an important Smali code sequence from which the category classification result of the target application is derived.
  • 18. A code sequence based intelligent key code identification device, comprising: a sequence extraction unit extracting Smali code sequences by decompiling a plurality of applications; a vectorization unit constructing a training dataset by vectorizing the extracted Smali code sequences; a learning unit training a deep learning model with the vectorized Smali code sequences to generate a classifier; a classification unit generating a category classification result using Smali code sequences of a target application as input of the classifier; and an identification unit identifying and providing an important Smali code sequence from which the category classification result of the target application is derived.
  • 19. The device of claim 18, wherein the vectorization unit comprises: a dataset generation unit constructing the training dataset using all the extracted Smali code sequences; and an embedding unit vectorizing the training dataset to use as input of the deep learning model.
  • 20. The device of claim 18, wherein the sequence extraction unit comprises: a Smali code unit extracting Smali code by decompiling the plurality of applications for each category; and a Smali sequence conversion unit converting the Smali code to the Smali code sequences.
  • 21. The device of claim 18, wherein the classification unit classifies the target application as a category having a highest probability among categories that will be classified for the target application.
  • 22. The device of claim 18, wherein the identification unit uses Local Interpretable Model-Agnostic Explanation (LIME) which is an algorithm that provides description of the deep learning model.
Priority Claims (1)
Number Date Country Kind
10-2020-0182736 Dec 2020 KR national
PCT Information
Filing Document Filing Date Country Kind
PCT/KR2021/001123 1/28/2021 WO 00