US20170004455A1 - Nonlinear featurization of decision trees for linear regression modeling - Google Patents
- Publication number
- US20170004455A1 (application US 14/788,717)
- Authority
- US
- United States
- Prior art keywords
- decision tree
- social network
- job
- member profile
- expression
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/105—Human resources
- G06Q10/1053—Employment or hiring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G06F17/30327—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G06N99/005—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Definitions
- This application relates to the technical fields of software and/or hardware technology and, in one example embodiment, to nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data.
- An on-line social network may be viewed as a platform to connect people in virtual space.
- An on-line social network may be a web-based platform, such as, e.g., a social networking web site, and may be accessed by a user via a web browser or via a mobile application provided on a mobile phone, a tablet, etc.
- An on-line social network may be a business-focused social network that is designed specifically for the business community, where registered members establish and document networks of people they know and trust professionally. Each registered member may be represented by a member profile.
- a member profile may include one or more web pages, or a structured representation of the member's information in XML (Extensible Markup Language), JSON (JavaScript Object Notation), etc.
- a member's profile web page of a social networking web site may emphasize employment history and education of the associated member.
- An on-line social network may include one or more components for matching member profiles with those job postings that may be of interest to the associated member.
- FIG. 1 is a diagrammatic representation of a network environment within which an example method and system to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data may be implemented;
- FIG. 2 is a diagram of an architecture for nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, in accordance with one example embodiment;
- FIG. 3 is an illustration of the use of decision trees as a learning to rank algorithm, in accordance with one example embodiment;
- FIG. 4 is another illustration of an example decision tree;
- FIG. 5 is a diagram of an architecture combining learning to rank and binary classification, in accordance with one example embodiment;
- FIG. 6 is a block diagram of a recommendation system, in accordance with one example embodiment;
- FIG. 7 is a flow chart of a method to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, in accordance with an example embodiment.
- FIG. 8 is a diagrammatic representation of an example machine in the form of a computer system within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed.
- Nonlinear featurization of decision trees for linear regression modeling in the context of an on-line social network is described.
- numerous specific details are set forth in order to provide a thorough understanding of an embodiment of the present invention. It will be evident, however, to one skilled in the art that the present invention may be practiced without these specific details.
- the term “or” may be construed in either an inclusive or exclusive sense.
- the term “exemplary” is used merely to mean an example of something or an exemplar and not necessarily a preferred or ideal means of accomplishing a goal.
- any type of server environment, including various system architectures, may employ various embodiments of the application-centric resources system and method described herein and is considered to be within the scope of the present invention.
- an on-line social networking application may be referred to as and used interchangeably with the phrase “an on-line social network” or merely “a social network.”
- an on-line social network may be any type of an on-line social network, such as, e.g., a professional network, an interest-based network, or any on-line networking system that permits users to join as registered members.
- registered members of an on-line social network may be referred to as simply members.
- Each member of an on-line social network is represented by a member profile (also referred to as a profile of a member or simply a profile).
- the profile information of a social network member may include personal information such as, e.g., the name of the member, current and previous geographic location of the member, current and previous employment information of the member, information related to education of the member, information about professional accomplishments of the member, publications, patents, etc.
- the profile information of a social network member may also include information about the member's professional skills, such as, e.g., “product management,” “patent prosecution,” “image processing,” etc.
- the profile of a member may also include information about the member's current and past employment, such as company identifications, professional titles held by the associated member at the respective companies, as well as the member's dates of employment at those companies.
- An on-line social network system also maintains information about various companies, as well as so-called job postings.
- a job posting for the purposes of this description is an electronically stored entity that includes information that an employer may post with respect to a job opening. The information in a job posting may include, e.g., the industry, job position, required and/or desirable skills, geographic location of the job, the name of a company, etc.
- the on-line social network system includes or is in communication with a so-called recommendation system.
- a recommendation system is configured to match member profiles with job postings, so that those job postings that have been identified as potentially being of interest to a member represented by a particular member profile are presented to the member on a display device for viewing. In one embodiment, the job postings that are identified as of potential interest to a member are presented to the member in order of relevance with respect to the associated member profile.
- Member profiles and job postings are represented in the on-line social network system by feature vectors.
- the features in the feature vectors may represent, e.g., a job industry, a professional field, a job title, a company name, professional seniority, geographic location, etc.
- a recommendation engine may include a binary classifier (e.g., in the form of a logistic regression model) that can be trained using a set of training data.
- the set of training data can be constructed using historical data that indicates whether a certain job posting presented to a certain member resulted in that member applying for that job.
- a trained binary classifier may be used to generate, for a (member profile, job posting) pair, a value indicative of the likelihood that a member represented by the member profile applies for a job represented by the job posting.
- a value indicative of the likelihood that a member represented by the member profile applies for a job represented by the job posting may be referred to as a relevance value or a degree of relevance.
- Those job postings, for which their respective relevance values for a particular member profile are equal to or greater than a predetermined threshold value, are presented to that particular member, e.g., on the news feed page of the member or on some other page provided by the on-line social networking system.
- Job postings presented to a member may be ordered based on their respective relevance values, such that those job postings that are determined to be more relevant (where the recommendation system determined that the member is more likely to apply for jobs represented by those listings as opposed to the jobs represented by other postings) are presented in such a manner that they would be more noticeable by the member, e.g. in a higher position in the list of relevant job postings.
- a recommendation engine that is provided in the form of a binary classifier trains a binary classification model on the (member profile, job posting) pairs and their corresponding labels that indicate whether or not the member represented by the member profile has applied for the job represented by the job posting.
- the binary classification model would learn global weights that are optimized to fit all the (member profile, job posting) pairs in the data set. If the binary classification model treats each (member profile, job posting) pair equally, the overall optimization result may be biased towards those member profiles that have been paired with a larger number of job postings, as compared to those member profiles that have been paired with fewer job postings.
- the algorithm may unduly emphasize unimportant or even irrelevant job postings (e.g., those job postings that were ignored and not viewed by a respective member).
- the degree of relevance may not always be well modeled. For instance, it does not take into consideration that even if a member does not apply for certain jobs, a job posting that is impressed but not clicked by the member may be inferred to be less relevant than the one that is impressed and clicked by the same member.
- a learning to rank approach may be utilized beneficially to address some of these problems, as it takes into consideration multiple ordered categories of relevance labels, such as, e.g., Perfect>Excellent>Good>Fair>Bad.
- a learning to rank model can learn from pairwise preference (e.g., job posting A is more relevant than job posting B for a particular member profile) thus directly optimizing for the rank order of job postings for each member profile. With ranking position taken into consideration during training, top-ranked job postings may be treated by the recommendation system as being of more importance than lower-ranked job postings.
- a learning to rank approach may also result in an equal optimization across all member profiles and help minimize bias towards those member profiles that have been paired with a larger number of job postings.
- a recommendation system may be configured to produce relevance labels mentioned above automatically without human intervention.
- a recommendation system may be configured to generate respective multi-point scale ranking labels for each (member profile, job posting) pair.
- the labels indicating different degrees of relevance may be, e.g., in the format of Bad, Fair, Good, Excellent, and Perfect.
- a recommendation system may train a ranking model (also referred to as a learning to rank model) that may be used by a ranker module of the recommendation system to rank job postings for each member profile, directly optimizing for the order of the ranking results based on a metric such as, e.g., normalized discounted cumulative gain (NDCG).
- the recommendation system constructs respective five-point labels for (member profile, job posting) pairs, utilizing feedback data collected by automatically monitoring member interactions with job postings that have been presented to them.
- the relevance labels are defined as shown below.
- Bad (Random): a randomly generated synthetic (member profile, job posting) pair of an active member profile with an active job posting, where the job posting has not been presented to the associated member, either at all or for a certain period of time.
- Fair (Impressed): a (member profile, job posting) pair where the job posting has been presented to the associated member (impressed), but there has been no further interaction of the associated member with the job posting, such as a click on the job posting to view the details of the posting.
- Good (Clicked): a (member profile, job posting) pair where the job posting has been presented to the associated member and the recommendation system also detected a click on the job posting to view the details of the posting, but no further event indicative of applying for the associated job has been detected by the recommendation system.
- Excellent (Applied): a (member profile, job posting) pair where the job posting has been presented to the associated member, and the recommendation system detected that the member clicked on the job posting to view the details and applied for the associated job, but did not detect a confirmation that the member has been hired for that job.
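The labeling scheme above can be sketched as a simple event-to-label mapping. This is an illustrative sketch only: the function name, the boolean event flags, and the handling of a hired confirmation (the Perfect label) are assumptions, not details taken from the description above.

```python
# Illustrative sketch of the five-point relevance labeling described above.
# The event flags and the Perfect/hired case are assumptions for illustration.
def relevance_label(impressed, clicked, applied, hired=False):
    """Map observed member interactions with a job posting to a relevance label."""
    if hired:
        return "Perfect"     # applied and a hire confirmation was detected
    if applied:
        return "Excellent"   # clicked and applied, but no hire confirmation
    if clicked:
        return "Good"        # clicked to view details, but did not apply
    if impressed:
        return "Fair"        # presented (impressed), but never clicked
    return "Bad"             # e.g., a randomly generated synthetic pair
```

Each label sits one step closer to the final successful action, matching the ordered categories Bad < Fair < Good < Excellent < Perfect.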
- the recommendation system uses five degrees of relevance
- a recommendation system may use a lesser or a greater number of degrees, where each degree of relevance corresponds to a respective temporal sequence of events, each one sequentially closer to the final successful action of a member represented by a member profile applying to a job represented by a job posting.
- a learning to rank approach described herein may be utilized beneficially in other settings, e.g., where each degree of relevance corresponds to a respective geographic proximity to a given location.
- a learning to rank model utilized by a recommendation system uses boosted gradient decision trees (BGDT) as the learning to rank algorithm.
- the Discounted Cumulative Gain (DCG) from position 1 to position p in the list of results can be defined as expressed below in Equation (2).
- NDCG can then be calculated as the DCG of the rank ordering, divided by the DCG of the ideal ordering (as if returned by an optimal ranker), which is expressed by Equation (3) below. NDCG is always within range [0,1].
- NDCG = DCG_ranker / DCG_ideal   Equation (3)
- the learning to rank algorithm may be in the form of boosted gradient decision trees and can be directly optimized for NDCG (as list-wise optimization).
- the DCG ranker is calculated using the rank scores and DCG ideal is calculated using the relevance labels.
- the error for an intermediate ranker produced during the training process is the difference between DCG ranker and DCG ideal, which can be used in the tree training process with gradient descent.
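The NDCG ratio of Equation (3) can be sketched in a few lines of code. Since Equation (2) is not reproduced above, the gain term 2**rel - 1 and the log2 position discount used here are the conventional ones and should be read as assumptions:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain from position 1 to p for a list of
    relevance grades (a common formulation with gain 2**rel - 1)."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    """NDCG = DCG of the ranker's ordering / DCG of the ideal ordering;
    the result is always within the range [0, 1]."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0
```

Here DCG ranker would be computed from the ordering induced by the rank scores and DCG ideal from the relevance labels, so a perfect ordering yields NDCG = 1.0.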
- a small number of small decision trees (e.g., decision trees with five leaves on each tree) can be trained with boosting, where a relevance score for a job posting with respect to a member profile is calculated as the sum of tree scores calculated for that job posting with respect to that member profile using respective decision trees, which is illustrated in a diagram 300 shown in FIG. 3.
- a decision tree is constructed to determine a ranking score calculated using respective features or respective sets of features from a (member profile, job posting) pair that is the subject of examination.
- one of the decision trees may be constructed to analyze respective job title features from the member profile and the job posting, and also to analyze the job company and location features from the job posting.
- One of the decision nodes from the tree may be to compare to a threshold value the cosine similarity matching score calculated with respect to the job title feature (e.g., represented by a title string) in the member profile and the job title feature from the job posting.
- Another decision node may be to compare to a threshold value a popularity score indicative of the popularity of the company and its location, represented by the job company and job location features from the job posting.
- the terminal nodes (leaf nodes) of a decision tree represent possible outcomes of applying the decision tree to a (member profile, job posting) pair. The outcomes are also referred to as tree scores. In FIG. 3 , the thicker edges show the decision tracks.
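The summing of tree scores illustrated in FIG. 3 can be sketched as follows. The dict-based tree encoding, the feature names (title_similarity, company_popularity), and the thresholds are illustrative assumptions, not values from the figures:

```python
# Illustrative sketch of scoring with an ensemble of small decision trees.
def tree_score(tree, features):
    """Walk one decision tree from the root to a leaf and return its tree score."""
    node = tree
    while "leaf" not in node:
        go_right = features[node["feature"]] > node["threshold"]
        node = node["right"] if go_right else node["left"]
    return node["leaf"]

def relevance_score(trees, features):
    """Relevance of a (member profile, job posting) pair: the sum of tree scores."""
    return sum(tree_score(t, features) for t in trees)

# Two toy trees: one on a title-similarity score, one on company popularity.
trees = [
    {"feature": "title_similarity", "threshold": 0.5,
     "left": {"leaf": 0.1}, "right": {"leaf": 0.9}},
    {"feature": "company_popularity", "threshold": 0.7,
     "left": {"leaf": 0.2}, "right": {"leaf": 0.6}},
]
```

The thicker decision tracks in FIG. 3 correspond to the particular root-to-leaf walk taken for a given feature vector.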
- the decision paths from the decision trees utilized in the ranking model may be automatically converted into a format that can be consumed by a binary classifier.
- This approach may be utilized beneficially to combine the two types of modeling techniques—learning to rank and binary classification—without disrupting the existing binary classification modeling architecture.
- a decision tree path can represent a business rule and can be provided in an if/then else statement format or in an s-expression format.
- a business rule may state that, for a particular member profile the recommendation system should present a first job posting (job 1) if the location of the job is in Seattle; otherwise the recommendation system should present a different job posting (job 2).
- This business rule can be represented in an if/then else statement format, as shown below.
- the same business rule can be represented in an s-expression format, as shown below.
- the s-expression “(if(>x 0) a b)” is equivalent to the statement “if(x>0) then a else b,” where x, a, b can be either variables or s-expressions.
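A toy evaluator for s-expressions of this shape, using nested Python lists in place of parenthesized text, can illustrate the equivalence. The operator set and the nested-list encoding are assumptions for illustration:

```python
def eval_sexpr(expr, env):
    """Evaluate a nested-list s-expression such as ["if", [">", "x", 0], "a", "b"]."""
    if isinstance(expr, str):
        return env.get(expr, expr)   # variable lookup; unbound names act as literals
    if not isinstance(expr, list):
        return expr                  # numeric literal
    op = expr[0]
    if op == "if":                   # (if cond then else)
        branch = expr[2] if eval_sexpr(expr[1], env) else expr[3]
        return eval_sexpr(branch, env)
    if op == ">":
        return eval_sexpr(expr[1], env) > eval_sexpr(expr[2], env)
    if op == "=":
        return eval_sexpr(expr[1], env) == eval_sexpr(expr[2], env)
    raise ValueError("unknown operator: %r" % op)

# The Seattle business rule from above, in nested-list s-expression form.
rule = ["if", ["=", "location", "Seattle"], "job 1", "job 2"]
```

Evaluating rule against an environment with location set to Seattle selects job 1; any other location selects job 2.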
- A computer-implemented converter may be configured to evaluate one or more business rules provided in the s-expression format with branch operators ('if') and transform the s-expressions into a decision tree that could then be included in the learning to rank algorithm used by a recommendation system.
- the converter may also be configured to read the decision tree structure of a tree that may be included in the learning to rank algorithm, and convert each path from root to a leaf into an s-expression that can be used as an s-expression feature.
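The tree-to-s-expression direction of the converter can be sketched as a walk that emits one expression per root-to-leaf path. The dict tree encoding and the (and ...)/(not ...) operators chosen here are assumptions:

```python
def paths_to_sexprs(node, conditions=()):
    """Yield an (s_expression, leaf_score) pair for every root-to-leaf path
    of a dict-encoded decision tree."""
    if "leaf" in node:
        if len(conditions) > 1:
            expr = "(and %s)" % " ".join(conditions)
        else:
            expr = conditions[0] if conditions else "true"
        yield (expr, node["leaf"])
        return
    test = "(> %s %s)" % (node["feature"], node["threshold"])
    yield from paths_to_sexprs(node["right"], conditions + (test,))
    yield from paths_to_sexprs(node["left"], conditions + ("(not %s)" % test,))
```

Each emitted s-expression can then serve as a feature for a logistic regression model: it evaluates to true exactly when an input pair falls into the corresponding leaf.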
- the decision trees can be converted to features and used in existing logistic regression models. The example below is provided with reference to a decision tree 400 shown in FIG. 4 .
- a converter configured to transform s-expressions into a decision tree and also to convert a decision tree structure into s-expressions may contribute to greater flexibility to use a trained tree-based ranker in different settings.
- a trained tree-based ranker can be used to replace existing logistic regression models as a pure ranking solution.
- a trained tree-based ranker, together with the converter, can be combined with existing logistic regression models.
- the s-expressions generated based on a tree structure from a ranking model or from one or more business rules may be utilized as additional nonlinear features for training the logistic regression model. This approach provides optimized nonlinear features to a linear model and may thus increase the linear model's capability of predicting nonlinear patterns.
- the s-expressions generated based on a tree structure from a ranking model or from one or more business rules may be utilized as weighted components in the final score produced by the logistic regression model, without retraining the existing logistic regression model. This approach may be useful in keeping some or most of the benefits of training a ranking model, while leveraging the benefits of the existing logistic regression model.
- Logistic regression models use a sigmoid function to calculate the final score (score), which is expressed below using Equation (4).
- The function ƒ(x) is modified to incorporate decision tree features in s-expression format, as shown below using Equation (5).
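Since Equations (4) and (5) are not reproduced above, the sketch below uses the conventional sigmoid form and simply extends the linear function ƒ(x) with additively weighted tree-derived features; the weight layout and the additive form are assumptions:

```python
import math

def sigmoid(z):
    """Standard logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, features, tree_weights, tree_features):
    """Final relevance score: a sigmoid over a linear function f(x) that is
    extended with decision-tree (s-expression) feature values."""
    z = sum(w * x for w, x in zip(weights, features))
    z += sum(v * t for v, t in zip(tree_weights, tree_features))
    return sigmoid(z)
```

With empty tree_weights this reduces to an ordinary logistic regression score; supplying nonzero tree weights folds the s-expression features into the final score without changing the linear part.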
- Example method and system to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data may be implemented in the context of a network environment 100 illustrated in FIG. 1.
- the network environment 100 may include client systems 110 and 120 and a server system 140 .
- the client system 120 may be a mobile device, such as, e.g., a mobile phone or a tablet.
- the server system 140 may host an on-line social network system 142 .
- each member of an on-line social network is represented by a member profile that contains personal and professional information about the member and that may be associated with social links that indicate the member's connection to other member profiles in the on-line social network.
- Member profiles and related information may be stored in a database 150 as member profiles 152 .
- the database 150 may also store job postings that may be viewed by members of the on-line social network system 142 .
- the client systems 110 and 120 may be capable of accessing the server system 140 via a communications network 130 , utilizing, e.g., a browser application 112 executing on the client system 110 , or a mobile application executing on the client system 120 .
- the communications network 130 may be a public network (e.g., the Internet, a mobile communication network, or any other network capable of communicating digital data).
- the server system 140 also hosts a recommendation system 144 .
- the recommendation system 144 may be utilized beneficially to identify and retrieve, from the database 150 , the job postings that are identified as of potential interest to a member represented by a member profile.
- the recommendation system 144 identifies potentially relevant job postings based on respective features that represent the job postings and the member profile.
- the architecture 200 includes a retrieval engine 210 , a ranker 220 , and a training data collector 230 .
- the retrieval engine 210 retrieves a list of recommended jobs 240 from a database 250 for a particular member profile, e.g., using a binary classifier in the form of a logistic regression model.
- the list of recommended jobs 240 may be in a format <member_ID, (job_posting_ID_1, . . .
- the ranker 220 executes a learning to rank model 222 with respect to the list of recommended jobs 240 to generate a respective rank score for each item in the list.
- the learning to rank model 222 may use boosted gradient decision trees as a learning to rank algorithm, where the terminal leaves in a decision tree represent relevance scores that can be attributed to a job posting with respect to a member profile.
- a rank score for an item in the list is calculated as the sum of rank scores determined for each of the decision trees, as shown in diagram 300 of FIG. 3 . In FIG. 3 , the thicker edges show the decision track.
- the rank scores calculated by the learning to rank model 222 are assigned to the items in the list of recommended jobs 240 .
- a list of recommended jobs with respective assigned rank scores 260 is provided to the training data collector 230 .
- the training data collector 230 monitors events with respect to how the member, for whom the list of recommended jobs 240 was generated, interacts with the associated job postings and, based on the monitored interactions, assigns relevance labels to the items in the list. As explained above, a job posting that is impressed and clicked by the associated member receives a different relevance label from a job posting that was impressed but not clicked by the associated member.
- a list of recommended jobs with respective assigned relevance labels 270 is provided to a repository of training data 280 .
- the training data stored in the database 280 is used to train the learning to rank model 222 .
- the learning to rank model 222 can be optimized for NDCG using the Equation (3) above, where DCG ranker is calculated using the rank scores and DCG ideal is calculated using the relevance labels.
- the retrieval engine 210 uses a binary classifier in the form of a logistic regression model.
- the logistic regression model may be retrained using additional features obtained from one or more decision trees used by the ranker 220 .
- Architecture of the recommendation system may utilize a converter configured to transform s-expressions into a decision tree that may be used by the ranker 220 and also to convert a decision tree structure into s-expressions that can be used as additional features to train logistic regression models.
- An example architecture 500 that includes such a converter is shown in FIG. 5.
- the architecture 500 includes a converter 510 , a learning to rank model 520 , a training data repository 530 , a logistic regression model 540 , and a repository 560 storing member profiles, job postings, and business rules.
- the converter 510 reads the decision tree structure of a tree that may be included in the learning to rank model 520 , and converts each path from root to a leaf into an s-expression.
- the s-expressions derived from decision trees of the learning to rank model 520 are then used as training data 530 to train the logistic regression model 540 .
- the logistic regression model 540 is executed to produce relevance scores 550 for (member profile, job posting) pairs obtained from the repository 560 .
- the converter 510 may also be configured to transform business rules that may be stored in the repository 560 as branch operators provided in the s-expression format into a decision tree that could be then included in the learning to rank model 520 .
- FIG. 6 is a block diagram of a system 600 to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, in accordance with one example embodiment.
- the system 600 includes a learning to rank module 610 , a converter 620 , a classifier 630 , and a presentation module 640 .
- the learning to rank module 610 is configured to learn a ranking model that uses decision trees as a learning to rank algorithm.
- the converter 620 is configured to read a decision tree structure of a particular decision tree from the decision trees used in the ranking model and to convert a path from root to a leaf in the particular decision tree into an s-expression.
- the classifier 630 is configured to generate a recommended jobs list for a member profile representing a member in an on-line social network system, utilizing the s-expression as a feature in a logistic regression model. As explained above, the classifier 630 may use the s-expression as an additional non-linear feature in retraining the logistic regression model. In some embodiments, the classifier 630 uses the s-expression as an additional non-linear feature in calculating a relevance score for a (member profile, job posting) pair. For example, the sigmoid function used by the classifier 630 to calculate relevance scores may be modified to incorporate the s-expression as an additional non-linear feature.
- the items in the recommended jobs list generated by the classifier 630 are references to job postings from a plurality of job postings maintained in the on-line social network system 142 of FIG. 1 .
- the presentation module 640 is configured to cause items from the recommended jobs list to be presented on a display device of a member represented by the member profile in an on-line social network system.
- the converter 620 may be further configured to access one or more business rules stored in the form of s-expressions, construct a decision tree based on these business rules stored in the form of s-expressions, and include the decision tree into the ranking model used by the learning to rank module 610 .
- a business rule may be, e.g., related to a job title represented by a feature from a member profile maintained in the on-line social network system 142 .
- FIG. 7 is a flow chart of a method 700 to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, according to one example embodiment.
- the method 700 may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as software run on a general-purpose computer system or a dedicated machine), or a combination of both.
- the processing logic resides at the server system 140 of FIG. 1 and, specifically, at the system 600 shown in FIG. 6 .
- the method 700 commences at operation 710 , when the learning to rank module 610 learns a ranking model that uses decision trees as a learning to rank algorithm.
- the converter 620 reads a decision tree structure of a particular decision tree from the decision trees used in the ranking model at operation 720 and converts a path from root to a leaf in the particular decision tree into an s-expression at operation 730 .
- the classifier 630 generates a recommended jobs list for a member profile representing a member in an on-line social network system, utilizing the s-expression as a feature in a logistic regression model, at operation 740 .
- the presentation module 640 causes items from the recommended jobs list to be presented on a display device of a member represented by the member profile in an on-line social network system.
- FIG. 8 is a diagrammatic representation of a machine in the example form of a computer system 800 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed.
- the machine may operate as a stand-alone device or may be connected (e.g., networked) to other machines.
- the machine may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
- the machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
- the example computer system 800 includes a processor 802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 804 and a static memory 806 , which communicate with each other via a bus 808 .
- the computer system 800 may further include a video display unit 810 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)).
- the computer system 800 also includes an alpha-numeric input device 812 (e.g., a keyboard), a user interface (UI) navigation device 814 (e.g., a cursor control device), a disk drive unit 816 , a signal generation device 818 (e.g., a speaker) and a network interface device 820 .
- the disk drive unit 816 includes a machine-readable medium 822 on which is stored one or more sets of instructions and data structures (e.g., software 824 ) embodying or utilized by any one or more of the methodologies or functions described herein.
- the software 824 may also reside, completely or at least partially, within the main memory 804 and/or within the processor 802 during execution thereof by the computer system 800 , with the main memory 804 and the processor 802 also constituting machine-readable media.
- the software 824 may further be transmitted or received over a network 826 via the network interface device 820 utilizing any one of a number of well-known transfer protocols (e.g., Hyper Text Transfer Protocol (HTTP)).
- machine-readable medium 822 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions.
- the term “machine-readable medium” shall also be taken to include any medium that is capable of storing and encoding a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of embodiments of the present invention, or that is capable of storing and encoding data structures utilized by or associated with such a set of instructions.
- the term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Such media may also include, without limitation, hard disks, floppy disks, flash memory cards, digital video disks, random access memories (RAMs), read-only memories (ROMs), and the like.
- inventions described herein may be implemented in an operating environment comprising software installed on a computer, in hardware, or in a combination of software and hardware.
- inventive subject matter may be referred to herein, individually or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is, in fact, disclosed.
- Modules may constitute either software modules (e.g., code embodied (1) on a non-transitory machine-readable medium or (2) in a transmission signal) or hardware-implemented modules.
- a hardware-implemented module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner.
- one or more computer systems (e.g., a standalone, client, or server computer system) or one or more processors may be configured by software (e.g., an application or application portion) as a hardware-implemented module that operates to perform certain operations as described herein.
- a hardware-implemented module may be implemented mechanically or electronically.
- a hardware-implemented module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations.
- a hardware-implemented module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware-implemented module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- the term “hardware-implemented module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired) or temporarily or transitorily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein.
- hardware-implemented modules are temporarily configured (e.g., programmed)
- each of the hardware-implemented modules need not be configured or instantiated at any one instance in time.
- the hardware-implemented modules comprise a general-purpose processor configured using software
- the general-purpose processor may be configured as respective different hardware-implemented modules at different times.
- Software may accordingly configure a processor, for example, to constitute a particular hardware-implemented module at one instance of time and to constitute a different hardware-implemented module at a different instance of time.
- Hardware-implemented modules can provide information to, and receive information from, other hardware-implemented modules. Accordingly, the described hardware-implemented modules may be regarded as being communicatively coupled. Where multiple of such hardware-implemented modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware-implemented modules. In embodiments in which multiple hardware-implemented modules are configured or instantiated at different times, communications between such hardware-implemented modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware-implemented modules have access. For example, one hardware-implemented module may perform an operation, and store the output of that operation in a memory device to which it is communicatively coupled.
- a further hardware-implemented module may then, at a later time, access the memory device to retrieve and process the stored output.
- Hardware-implemented modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
- processors may be temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions.
- the modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
- the methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
- the one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., Application Program Interfaces (APIs)).
Abstract
Nonlinear featurization of decision trees for linear regression modeling in the context of an on-line social network is described. A computer-implemented converter is provided that is capable of reading a decision tree structure that is included in the learning to rank algorithm and of converting each path from root to a leaf into an s-expression. The s-expressions are used as additional features to train a logistic regression model.
Description
- This application relates to the technical fields of software and/or hardware technology and, in one example embodiment, to nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data.
- An on-line social network may be viewed as a platform to connect people in virtual space. An on-line social network may be a web-based platform, such as, e.g., a social networking web site, and may be accessed by a user via a web browser or via a mobile application provided on a mobile phone, a tablet, etc. An on-line social network may be a business-focused social network that is designed specifically for the business community, where registered members establish and document networks of people they know and trust professionally. Each registered member may be represented by a member profile. A member profile may include one or more web pages, or a structured representation of the member's information in XML (Extensible Markup Language), JSON (JavaScript Object Notation), etc. A member's profile web page of a social networking web site may emphasize employment history and education of the associated member. An on-line social network may include one or more components for matching member profiles with those job postings that may be of interest to the associated member.
- Embodiments of the present invention are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like reference numbers indicate similar elements and in which:
-
FIG. 1 is a diagrammatic representation of a network environment within which an example method and system to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data may be implemented; -
FIG. 2 is a diagram of an architecture for nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, in accordance with one example embodiment; -
FIG. 3 is an illustration of the use of decision trees as a learning to rank algorithm, in accordance with one example embodiment; -
FIG. 4 is another illustration of an example decision tree; -
FIG. 5 is a diagram of an architecture combining learning to rank and binary classification, in accordance with one example embodiment; -
FIG. 6 is a block diagram of a recommendation system, in accordance with one example embodiment; -
FIG. 7 is a flow chart of a method to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, in accordance with an example embodiment; and -
FIG. 8 is a diagrammatic representation of an example machine in the form of a computer system within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. - Nonlinear featurization of decision trees for linear regression modeling in the context of an on-line social network is described. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of an embodiment of the present invention. It will be evident, however, to one skilled in the art that the present invention may be practiced without these specific details.
- As used herein, the term “or” may be construed in either an inclusive or exclusive sense. Similarly, the term “exemplary” is merely meant to denote an example of something or an exemplar, and not necessarily a preferred or ideal means of accomplishing a goal. Additionally, although various exemplary embodiments discussed below may utilize Java-based servers and related environments, the embodiments are given merely for clarity in disclosure. Thus, any type of server environment, including various system architectures, may employ various embodiments of the application-centric resources system and method described herein and is considered as being within the scope of the present invention.
- For the purposes of this description the phrase “an on-line social networking application” may be referred to as and used interchangeably with the phrase “an on-line social network” or merely “a social network.” It will also be noted that an on-line social network may be any type of an on-line social network, such as, e.g., a professional network, an interest-based network, or any on-line networking system that permits users to join as registered members. For the purposes of this description, registered members of an on-line social network may be referred to as simply members.
- Each member of an on-line social network is represented by a member profile (also referred to as a profile of a member or simply a profile). The profile information of a social network member may include personal information such as, e.g., the name of the member, current and previous geographic location of the member, current and previous employment information of the member, information related to education of the member, information about professional accomplishments of the member, publications, patents, etc. The profile information of a social network member may also include information about the member's professional skills, such as, e.g., “product management,” “patent prosecution,” “image processing,” etc. The profile of a member may also include information about the member's current and past employment, such as company identifications, professional titles held by the associated member at the respective companies, as well as the member's dates of employment at those companies.
- An on-line social network system also maintains information about various companies, as well as so-called job postings. A job posting, for the purposes of this description, is an electronically stored entity that includes information that an employer may post with respect to a job opening. The information in a job posting may include, e.g., the industry, job position, required and/or desirable skills, geographic location of the job, the name of a company, etc. The on-line social network system includes or is in communication with a so-called recommendation system. A recommendation system is configured to match member profiles with job postings, so that those job postings that have been identified as potentially being of interest to a member represented by a particular member profile are presented to the member on a display device for viewing. In one embodiment, the job postings that are identified as of potential interest to a member are presented to the member in order of relevance with respect to the associated member profile.
- Member profiles and job postings are represented in the on-line social network system by feature vectors. The features in the feature vectors may represent, e.g., a job industry, a professional field, a job title, a company name, professional seniority, geographic location, etc. A recommendation engine may include a binary classifier (e.g., in the form of a logistic regression model) that can be trained using a set of training data. The set of training data can be constructed using historical data that indicates whether a certain job posting presented to a certain member resulted in that member applying for that job. A trained binary classifier may be used to generate, for a (member profile, job posting) pair, a value indicative of the likelihood that a member represented by the member profile applies for a job represented by the job posting. A value indicative of the likelihood that a member represented by the member profile applies for a job represented by the job posting may be referred to as a relevance value or a degree of relevance. Those job postings, for which their respective relevance values for a particular member profile are equal to or greater than a predetermined threshold value, are presented to that particular member, e.g., on the news feed page of the member or on some other page provided by the on-line social networking system. Job postings presented to a member may be ordered based on their respective relevance values, such that those job postings that are determined to be more relevant (where the recommendation system determined that the member is more likely to apply for jobs represented by those listings as opposed to the jobs represented by other postings) are presented in such a manner that they would be more noticeable by the member, e.g. in a higher position in the list of relevant job postings.
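The thresholding and ordering step described above can be sketched as follows; the function name and the (job posting, relevance) pair representation are illustrative assumptions rather than the system's actual interface.

```python
def recommended_jobs(scored_postings, threshold):
    """scored_postings: iterable of (job_posting_id, relevance_value) pairs.

    Keep only postings whose relevance value is equal to or greater than the
    predetermined threshold, ordered so the most relevant appear first."""
    kept = [(pid, rel) for pid, rel in scored_postings if rel >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [pid for pid, rel in kept]
```

For example, with a threshold of 0.4, `recommended_jobs([("job_a", 0.2), ("job_b", 0.9), ("job_c", 0.5)], 0.4)` keeps only job_b and job_c, with the more relevant job_b placed in the higher, more noticeable position.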
- A recommendation engine that is provided in the form of a binary classifier trains a binary classification model on the (member profile, job posting) pairs and their corresponding labels that indicate whether or not the member represented by the member profile has applied for the job represented by the job posting. The binary classification model would learn global weights that are optimized to fit all the (member profile, job posting) pairs in the data set. If the binary classification model treats each (member profile, job posting) pair equally, the overall optimization result may be biased towards those member profiles that have been paired with a larger number of job postings as compared to those member profiles that have been paired with fewer job postings. If the binary classification model treats equally each job posting paired with a member profile, regardless, e.g., of whether the associated member viewed the job posting or not, such that the respective positions of job postings in the ranked list are invisible in the learning process, the algorithm may unduly emphasize unimportant or even irrelevant job postings (e.g., those job postings that were ignored and not viewed by a respective member). In the binary classification model, the degree of relevance may not always be well modeled. For instance, it does not take into consideration that even if a member does not apply for certain jobs, a job posting that is impressed but not clicked by the member may be inferred to be less relevant than one that is impressed and clicked by the same member.
- A learning to rank approach may be utilized beneficially to address some of these problems, as it takes into consideration multiple ordered categories of relevance labels, such as, e.g., Perfect>Excellent>Good>Fair>Bad. A learning to rank model can learn from pairwise preference (e.g., job posting A is more relevant than job posting B for a particular member profile) thus directly optimizing for the rank order of job postings for each member profile. With ranking position taken into consideration during training, top-ranked job postings may be treated by the recommendation system as being of more importance than lower-ranked job postings. In addition, a learning to rank approach may also result in an equal optimization across all member profiles and help minimize bias towards those member profiles that have been paired with a larger number of job postings. In one example embodiment, a recommendation system may be configured to produce relevance labels mentioned above automatically without human intervention.
- A recommendation system may be configured to generate respective multi-point scale ranking labels for each (member profile, job posting) pair. The labels indicating different degrees of relevance may be, e.g., in the format of Bad, Fair, Good, Excellent, and Perfect. Using such label data, a recommendation system may train a ranking model (also referred to as a learning to rank model) that may be used by a ranker module of the recommendation system to rank job postings for each member profile, directly optimizing for the order of the ranking results based on a metric such as, e.g., normalized discounted cumulative gain (NDCG).
- In one example embodiment, in order to train a learning to rank model, the recommendation system constructs respective five-point labels for (member profile, job posting) pairs, utilizing feedback data collected by automatically monitoring member interactions with job postings that have been presented to them. In one embodiment, the relevance labels are defined as shown below.
-
-
- Good Clicked: (member profile, job posting) pair, where the job posting has been presented to the associated member and the recommendation system also detected a click on the job posting to view the details of the posting, but no further event indicative of applying for the associated job has been detected by the recommendation system.
- Excellent Applied: (member profile, job posting) pair, where the job posting has been presented to the associated member, and the recommendation system also detected that the member clicked on the job posting to view the details and applied for the associated job but did not detect a confirmation that the member has been hired for that job.
- Perfect Hired: (member profile, job posting) pair, where the recommendation system detected a confirmation that the member has been hired for that job. There are multiple ways to infer a hired event within the system, e.g., a) directly through recruiter feedback, and b) through members' job change events, which can be further inferred from a member updating certain fields of their profile, such as job location, job title, and company.
- It will be noted that although, in one embodiment, the recommendation system uses five degrees of relevance, a recommendation system may use a lesser or a greater number of degrees, where each degree of relevance corresponds to a respective temporal sequence of events, each one sequentially closer to the final successful action of a member represented by a member profile applying to a job represented by a job posting. A learning to rank approach described herein may be utilized beneficially in other settings, e.g., where each degree of relevance corresponds to a respective geographic proximity to a given location.
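The five-point labeling described above can be sketched as a mapping from detected interaction events. The Perfect, Excellent, and Good conditions follow the definitions above; treating a bare impression (presented but never clicked) as Fair, and the absence of any detected event as Bad, is an assumed reading of the two labels whose definitions are not reproduced here.

```python
def relevance_label(impressed, clicked, applied, hired):
    """Map the interaction events detected for a (member profile, job
    posting) pair to one of the five ordered relevance labels.

    Checks the events from the strongest signal down: a hire implies an
    application, which implies a click, which implies an impression."""
    if hired:
        return "Perfect"
    if applied:
        return "Excellent"
    if clicked:
        return "Good"
    if impressed:
        return "Fair"   # assumed: presented but not clicked
    return "Bad"        # assumed: no detected interaction
```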
- In one example embodiment, a learning to rank model utilized by a recommendation system uses boosted gradient decision trees (BGDT) as the learning to rank algorithm. Once the recommendation system generates multi-point scale relevance labels, it converts these labels into numeric gains and uses the respective Discounted Cumulative Gain (DCG) values as measurements and targets for the model training. Table 1 below illustrates how different labels correspond to respective relevance values (identified as “Grade” in Table 1) and respective gains (identified as “Gain” in Table 1).
-
TABLE 1

Label      Grade   Gain
Bad          0       0
Fair         1       1
Good         2       3
Excellent    3       7
Perfect      4      15

- In Table 1, a Gain value is calculated as expressed in Equation (1) below.
-
Gain = 2^Grade − 1   Equation (1)
- The Discounted Cumulative Gain (DCG) from position 1 to position p in the list of results (e.g., in the list of references to recommended job postings) can be defined as expressed below in Equation (2).
-
DCG_p = Σ_{i=1}^{p} Gain_i / log2(i + 1)   Equation (2)
- NDCG can then be calculated as the DCG of the rank ordering, divided by the DCG of the ideal ordering (as if returned by an optimal ranker), which is expressed by Equation (3) below. NDCG is always within range [0,1].
-
NDCG_p = DCG_ranker / DCG_ideal   Equation (3)
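The gain and NDCG computations described above can be sketched as follows. The log2(i + 1) position discount inside the DCG sum is the standard NDCG discount and is assumed here, since the equation images themselves are not reproduced in the text.

```python
import math

def gain(grade):
    # Equation (1): Gain = 2^Grade - 1 (matches the Grade/Gain columns of Table 1).
    return 2 ** grade - 1

def dcg(grades):
    # Equation (2): discounted cumulative gain over ranked positions 1..p,
    # assuming the standard log2(i + 1) position discount.
    return sum(gain(g) / math.log2(i + 1) for i, g in enumerate(grades, start=1))

def ndcg(ranked_grades):
    # Equation (3): DCG of the ranker's ordering divided by the DCG of the
    # ideal (descending-grade) ordering; always within the range [0, 1].
    ideal = dcg(sorted(ranked_grades, reverse=True))
    return dcg(ranked_grades) / ideal if ideal > 0 else 0.0
```

A list already in ideal order scores an NDCG of 1.0, and demoting a highly graded result below a lower one strictly lowers the metric, which is what the list-wise training target rewards.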
- As mentioned above, the learning to rank algorithm may be in the form of boosted gradient decision trees and can be directly optimized for NDCG (as list-wise optimization). In Equation (3) above, DCG_ranker is calculated using the rank scores and DCG_ideal is calculated using the relevance labels. The error for an intermediate ranker produced during the training process is the difference between DCG_ranker and DCG_ideal, which can be used in the tree training process with gradient descent. A small number of small decision trees (e.g., decision trees with five leaves on each tree) can be trained with boosting, where a relevance score for a job posting with respect to a member profile is calculated as the sum of tree scores calculated for that job posting with respect to that member profile using respective decision trees, which is illustrated in a diagram 300 shown in
FIG. 3 . A decision tree is constructed to determine a ranking score calculated using respective features or respective sets of features from a (member profile, job posting) pair that is the subject of examination. For example, one of the decision trees may be constructed to analyze respective job title features from the member profile and the job posting, and also to analyze the job company and location features from the job posting. One of the decision nodes from the tree may be to compare to a threshold value the cosine similarity matching score calculated with respect to the job title feature (e.g., represented by a title string) in the member profile and the job title feature from the job posting. Another decision node may be to compare to a threshold value a popularity score indicative of how popular is the company and its location represented by the job company and job location features from the job posting. The terminal nodes (leaf nodes) of a decision tree represent possible outcomes of applying the decision tree to a (member profile, job posting) pair. The outcomes are also referred to as tree scores. InFIG. 3 , the thicker edges show the decision tracks. - In one embodiment, the decision paths from the decision trees utilized in the ranking model may be automatically converted into a format that can be consumed by a binary classifier. This approach may be utilized beneficially to combine the two types of modeling techniques—learning to rank and binary classification—without disrupting the existing binary classification modeling architecture.
- Method and system are described to transform decision tree paths into nonlinear s-expression features, which may also be referred to as nonlinear featurization of decision trees for linear regression modeling. A decision tree path can represent a business rule and can be provided in an if/then else statement format or in an s-expression format. For example, a business rule may state that, for a particular member profile the recommendation system should present a first job posting (job 1) if the location of the job is in Seattle; otherwise the recommendation system should present a different job posting (job 2). This business rule can be represented in an if/then else statement format, as shown below.
- {if Location == Seattle, then return Job1; else return Job2}
-
- The same business rule can be represented in an s-expression format, as shown below.
- (if(=location Seattle) Job1 Job2)
- Using another example, the s-expression “(if(>x 0) a b)” is equivalent to the statement “if(x>0) then a else b,” where x, a, b can be either variables or s-expressions.
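A converter that consumes such s-expressions can be sketched with a small tokenizer, parser, and evaluator. The code below is illustrative only; the function names and the treatment of unbound symbols (they evaluate to themselves, so `Seattle` acts as a literal) are assumptions, not the patent's implementation:

```python
# Minimal s-expression evaluator for rules such as "(if (> x 0) a b)".
def tokenize(s):
    return s.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return expr
    try:
        return float(tok)  # numeric literal
    except ValueError:
        return tok         # operator, variable, or symbol

def evaluate(expr, env):
    if isinstance(expr, float):
        return expr
    if isinstance(expr, str):
        return env.get(expr, expr)  # unbound symbols evaluate to themselves
    op = expr[0]
    if op == "if":
        _, cond, then_branch, else_branch = expr
        return evaluate(then_branch if evaluate(cond, env) else else_branch, env)
    if op == ">":
        return evaluate(expr[1], env) > evaluate(expr[2], env)
    if op == "=":
        return evaluate(expr[1], env) == evaluate(expr[2], env)
    raise ValueError("unknown operator: " + str(op))

rule = parse(tokenize("(if (= location Seattle) Job1 Job2)"))
print(evaluate(rule, {"location": "Seattle"}))  # Job1
```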
- In one embodiment, a computer-implemented converter may be configured to evaluate one or more business rules provided in the s-expression format as branch operators ('if') and transform the s-expressions into a decision tree that can then be included in the learning to rank algorithm used by a recommendation system. The converter may also be configured to read the decision tree structure of a tree that may be included in the learning to rank algorithm, and convert each path from root to a leaf into an s-expression that can be used as an s-expression feature. Thus the decision trees can be converted to features and used in existing logistic regression models. The example below is provided with reference to a
decision tree 400 shown in FIG. 4. - s-expression feature 1 (for leftmost leaf): “(if (>X 0.5) (if (>Y 0.4) 0.1 0) 0)”
- s-expression feature 2 (for middle leaf): “(if (>X 0.5) (if (>Y 0.4) 0 0.2) 0)”
- s-expression feature 3 (for rightmost leaf): “(if (>X 0.5) 0 0.3)”
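The three features above can be produced mechanically by walking each root-to-leaf path of a tree once per leaf: the taken leaf keeps its score and every other branch contributes 0. A minimal sketch, assuming a simple binary node layout (not the patent's internal representation):

```python
# Sketch: convert each root-to-leaf path of a decision tree into an
# s-expression feature string, as in the three example features above.
class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, score=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right  # left = condition true branch
        self.score = score                   # set only on leaf nodes

def leaf_expressions(node):
    """Return one s-expression string per leaf of the tree."""
    if node.score is not None:  # leaf
        return [str(node.score)]
    cond = "(>%s %s)" % (node.feature, node.threshold)
    exprs = []
    for expr in leaf_expressions(node.left):   # leaves reached when cond is true
        exprs.append("(if %s %s 0)" % (cond, expr))
    for expr in leaf_expressions(node.right):  # leaves reached when cond is false
        exprs.append("(if %s 0 %s)" % (cond, expr))
    return exprs

# A tree matching the three example features listed above:
tree = Node("X", 0.5,
            left=Node("Y", 0.4, left=Node(score=0.1), right=Node(score=0.2)),
            right=Node(score=0.3))
for feat in leaf_expressions(tree):
    print(feat)
```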
- A converter configured to transform s-expressions into a decision tree and also to convert a decision tree structure into s-expressions may contribute greater flexibility in using a trained tree-based ranker in different settings. A trained tree-based ranker can be used to replace existing logistic regression models as a pure ranking solution. In some embodiments, a trained tree-based ranker, together with the converter, can be combined with existing logistic regression models. For example, the s-expressions generated based on a tree structure from a ranking model or from one or more business rules may be utilized as additional nonlinear features for training the logistic regression model. This approach provides optimized non-linear features to a linear model, which may increase the linear model's capability of predicting nonlinear patterns. In a further embodiment, the s-expressions generated based on a tree structure from a ranking model or from one or more business rules may be utilized as weighted components in the final score produced by the logistic regression model, without retraining the existing logistic regression model. This approach may be useful in keeping some or most of the benefits of training a ranking model, while leveraging the benefits of the existing logistic regression model.
- The logistic regression model uses a sigmoid function to calculate the final score (score), which is expressed below using Equation (4).
- score = 1/(1 + e^(−φ(x)))   (4)
- The function φ(x) is modified to incorporate decision tree features in s-expression format as shown below using Equation (5).
- φ(x) = w1x1 + . . . + wnxn + v1s1(x) + . . . + vksk(x)   (5), where x1, . . . , xn are the original features, s1(x), . . . , sk(x) are the decision tree features in s-expression format, and wi, vj are learned weights.
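A hedged sketch of how the sigmoid score and the modified φ(x) combine: the linear combination over the original features is extended with weighted decision tree (s-expression) feature values and squashed by the sigmoid. All weights and feature values below are invented for illustration:

```python
# Sketch: sigmoid score over original features plus weighted
# decision-tree (s-expression) feature values. Weights are illustrative.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(x, w, tree_feats, v):
    # phi(x): linear combination of original features ...
    phi = sum(wi * xi for wi, xi in zip(w, x))
    # ... plus weighted evaluated s-expression (tree) features
    phi += sum(vj * tj for vj, tj in zip(v, tree_feats))
    return sigmoid(phi)

# x: base features; tree_feats: evaluated s-expression outputs (tree scores)
print(round(score([1.0, 0.5], [0.3, -0.2], [0.1, 0.0, 0.3], [1.0, 1.0, 1.0]), 4))
```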
- An example method and system to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data may be implemented in the context of a
network environment 100 illustrated in FIG. 1. - As shown in
FIG. 1, the network environment 100 may include client systems 110 and 120 and a server system 140. The client system 120 may be a mobile device, such as, e.g., a mobile phone or a tablet. The server system 140, in one example embodiment, may host an on-line social network system 142. As explained above, each member of an on-line social network is represented by a member profile that contains personal and professional information about the member and that may be associated with social links that indicate the member's connection to other member profiles in the on-line social network. Member profiles and related information may be stored in a database 150 as member profiles 152. The database 150 may also store job postings that may be viewed by members of the on-line social network system 142. - The
client systems 110 and 120 may access the server system 140 via a communications network 130, utilizing, e.g., a browser application 112 executing on the client system 110, or a mobile application executing on the client system 120. The communications network 130 may be a public network (e.g., the Internet, a mobile communication network, or any other network capable of communicating digital data). As shown in FIG. 1, the server system 140 also hosts a recommendation system 144. The recommendation system 144 may be utilized beneficially to identify and retrieve, from the database 150, the job postings that are identified as of potential interest to a member represented by a member profile. The recommendation system 144 identifies potentially relevant job postings based on respective features that represent the job postings and the member profile. These potentially relevant job postings, which may be identified off-line for each member or on-the-fly in response to a predetermined event (e.g., an explicit request from a member), are presented to the member in order of inferred relevance. The order of presentation may be determined using a learning to rank model, as described above. A learning to rank model may be trained using the training data stored in the database 150 as training data 154. The training data may be obtained automatically, as described above and also further below. The recommendation system 144, in some embodiments, is configured to use two types of modeling techniques—learning to rank and binary classification. Example architecture 200 of a recommendation system is illustrated in FIG. 2. - As shown in
FIG. 2, the architecture 200 includes a retrieval engine 210, a ranker 220, and a training data collector 230. The retrieval engine 210 retrieves a list of recommended jobs 240 from a database 250 for a particular member profile, e.g., using a binary classifier in the form of a logistic regression model. The list of recommended jobs 240 may be in a format {member_ID, (job_posting_ID1, . . . , job_posting_IDn)}, where member_ID is a reference to a member profile and job_posting_IDi items are references to job postings that have been determined as being potentially of interest to a member represented by the member profile in the on-line social network system 142 of FIG. 1. The ranker 220 executes a learning to rank model 222 with respect to the list of recommended jobs 240 to generate a respective rank score for each item in the list. The learning to rank model 222 may use boosted gradient decision trees as a learning to rank algorithm, where the terminal leaves in a decision tree represent relevance scores that can be attributed to a job posting with respect to a member profile. A rank score for an item in the list is calculated as the sum of rank scores determined for each of the decision trees, as shown in diagram 300 of FIG. 3. In FIG. 3, the thicker edges show the decision tracks. - Returning to
FIG. 2, the rank scores calculated by the learning to rank model 222 are assigned to the items in the list of recommended jobs 240. A list of recommended jobs with respective assigned rank scores 260 is provided to the training data collector 230. The training data collector 230 monitors events with respect to how the member, for whom the list of recommended jobs 240 was generated, interacts with the associated job postings and, based on the monitored interactions, assigns relevance labels to the items in the list. As explained above, a job posting that is impressed and clicked by the associated member receives a different relevance label from a job posting that was impressed but not clicked by the associated member. A list of recommended jobs with respective assigned relevance labels 270 is provided to a repository of training data 280. The training data stored in the database 280 is used to train the learning to rank model 222. As explained above, the learning to rank model 222 can be optimized for NDCG using Equation (3) above, where DCGranker is calculated using the rank scores and DCGideal is calculated using the relevance labels. - As mentioned above, the
retrieval engine 210, in one example embodiment, uses a binary classifier in the form of a logistic regression model. The logistic regression model may be retrained using additional features obtained from one or more decision trees used by the ranker 220. The architecture of the recommendation system may utilize a converter configured to transform s-expressions into a decision tree that may be used by the ranker 220 and also to convert a decision tree structure into s-expressions that can be used as additional features to train logistic regression models. An example architecture 500 that includes such a converter is shown in FIG. 5. - As shown in
FIG. 5, the architecture 500 includes a converter 510, a learning to rank model 520, a training data repository 530, a logistic regression model 540, and a repository 560 storing member profiles, job postings, and business rules. The converter 510 reads the decision tree structure of a tree that may be included in the learning to rank model 520, and converts each path from root to a leaf into an s-expression. The s-expressions derived from decision trees of the learning to rank model 520 are then used as training data 530 to train the logistic regression model 540. The logistic regression model 540 is executed to produce relevance scores 550 for (member profile, job posting) pairs obtained from the repository 560. The converter 510 may also be configured to transform business rules, which may be stored in the repository 560 as branch operators provided in the s-expression format, into a decision tree that can then be included in the learning to rank model 520. - An
example recommendation system 144 of FIG. 1 is illustrated in FIG. 6. FIG. 6 is a block diagram of a system 600 to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, in accordance with one example embodiment. As shown in FIG. 6, the system 600 includes a learning to rank module 610, a converter 620, a classifier 630, and a presentation module 640. The learning to rank module 610 is configured to learn a ranking model that uses decision trees as a learning to rank algorithm. The converter 620 is configured to read a decision tree structure of a particular decision tree from the decision trees used in the ranking model and to convert a path from root to a leaf in the particular decision tree into an s-expression. The classifier 630 is configured to generate a recommended jobs list for a member profile representing a member in an on-line social network system, utilizing the s-expression as a feature in a logistic regression model. As explained above, the classifier 630 may use the s-expression as an additional non-linear feature in retraining the logistic regression model. In some embodiments, the classifier 630 uses the s-expression as an additional non-linear feature in calculating a relevance score for a (member profile, job posting) pair. For example, the sigmoid function used by the classifier 630 to calculate relevance scores may be modified to incorporate the s-expression as an additional non-linear feature. The items in the recommended jobs list generated by the classifier 630 are references to job postings from a plurality of job postings maintained in the on-line social network system 142 of FIG. 1. The presentation module 640 is configured to cause items from the recommended jobs list to be presented on a display device of a member represented by the member profile in an on-line social network system. - The
converter 620 may be further configured to access one or more business rules stored in the form of s-expressions, construct a decision tree based on these business rules, and include the decision tree into the ranking model used by the learning to rank module 610. A business rule may be, e.g., related to a job title represented by a feature from a member profile maintained in the on-line social network system 142. Some operations performed by the system 600 may be described with reference to FIG. 7. -
FIG. 7 is a flow chart of a method 700 to utilize nonlinear featurization of decision trees for linear regression modeling in the context of on-line social network data, according to one example embodiment. The method 700 may be performed by processing logic that may comprise hardware (e.g., dedicated logic, programmable logic, microcode, etc.), software (such as software run on a general purpose computer system or a dedicated machine), or a combination of both. In one example embodiment, the processing logic resides at the server system 140 of FIG. 1 and, specifically, at the system 600 shown in FIG. 6. - As shown in
FIG. 7, the method 700 commences at operation 710, when the learning to rank module 610 learns a ranking model that uses decision trees as a learning to rank algorithm. The converter 620 reads a decision tree structure of a particular decision tree from the decision trees used in the ranking model at operation 720 and converts a path from root to a leaf in the particular decision tree into an s-expression at operation 730. The classifier 630 generates a recommended jobs list for a member profile representing a member in an on-line social network system, utilizing the s-expression as a feature in a logistic regression model, at operation 740. At operation 750, the presentation module 640 causes items from the recommended jobs list to be presented on a display device of a member represented by the member profile in an on-line social network system. - The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
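The NDCG optimization mentioned above compares DCGranker with DCGideal. Equation (3) itself is not reproduced in this excerpt; the sketch below assumes the common DCG form sum((2^rel − 1)/log2(position + 1)), which may differ in detail from the patent's Equation (3):

```python
# Sketch: NDCG = DCG of the presented order / DCG of the ideal order,
# assuming the common (2^rel - 1) / log2(position + 1) gain form.
import math

def dcg(relevances):
    # position is 1-based, so the discount is log2(position + 1)
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(relevances_in_ranked_order):
    ideal = dcg(sorted(relevances_in_ranked_order, reverse=True))
    return dcg(relevances_in_ranked_order) / ideal if ideal else 0.0

# relevance labels of jobs in the order the ranker presented them
print(round(ndcg([1, 0, 2]), 4))
```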
- Similarly, the methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
-
FIG. 8 is a diagrammatic representation of a machine in the example form of a computer system 800 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. In alternative embodiments, the machine operates as a stand-alone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. - The
example computer system 800 includes a processor 802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 804 and a static memory 806, which communicate with each other via a bus 808. The computer system 800 may further include a video display unit 810 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 800 also includes an alpha-numeric input device 812 (e.g., a keyboard), a user interface (UI) navigation device 814 (e.g., a cursor control device), a disk drive unit 816, a signal generation device 818 (e.g., a speaker) and a network interface device 820. - The
disk drive unit 816 includes a machine-readable medium 822 on which is stored one or more sets of instructions and data structures (e.g., software 824) embodying or utilized by any one or more of the methodologies or functions described herein. The software 824 may also reside, completely or at least partially, within the main memory 804 and/or within the processor 802 during execution thereof by the computer system 800, with the main memory 804 and the processor 802 also constituting machine-readable media. - The
software 824 may further be transmitted or received over a network 826 via the network interface device 820 utilizing any one of a number of well-known transfer protocols (e.g., Hyper Text Transfer Protocol (HTTP)). - While the machine-readable medium 822 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing and encoding a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of embodiments of the present invention, or that is capable of storing and encoding data structures utilized by or associated with such a set of instructions. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media. Such media may also include, without limitation, hard disks, floppy disks, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROMs), and the like. - The embodiments described herein may be implemented in an operating environment comprising software installed on a computer, in hardware, or in a combination of software and hardware. Such embodiments of the inventive subject matter may be referred to herein, individually or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is, in fact, disclosed.
- Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code embodied (1) on a non-transitory machine-readable medium or (2) in a transmission signal) or hardware-implemented modules. A hardware-implemented module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., a standalone, client or server computer system) or one or more processors may be configured by software (e.g., an application or application portion) as a hardware-implemented module that operates to perform certain operations as described herein.
- In various embodiments, a hardware-implemented module may be implemented mechanically or electronically. For example, a hardware-implemented module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware-implemented module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware-implemented module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- Accordingly, the term “hardware-implemented module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired) or temporarily or transitorily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which hardware-implemented modules are temporarily configured (e.g., programmed), each of the hardware-implemented modules need not be configured or instantiated at any one instance in time. For example, where the hardware-implemented modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware-implemented modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware-implemented module at one instance of time and to constitute a different hardware-implemented module at a different instance of time.
- Hardware-implemented modules can provide information to, and receive information from, other hardware-implemented modules. Accordingly, the described hardware-implemented modules may be regarded as being communicatively coupled. Where multiple of such hardware-implemented modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware-implemented modules. In embodiments in which multiple hardware-implemented modules are configured or instantiated at different times, communications between such hardware-implemented modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware-implemented modules have access. For example, one hardware-implemented module may perform an operation, and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware-implemented module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware-implemented modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
- The one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., Application Program Interfaces (APIs)).
- Thus, nonlinear featurization of decision trees for linear regression modeling in the context of an on-line social network has been described. Although embodiments have been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader scope of the inventive subject matter. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
Claims (20)
1. A computer-implemented method comprising:
constructing a particular decision tree to determine a ranking score using respective features from a pair comprising a member profile representing a member in an on-line social network system and a job posting, the particular decision tree comprising a node to compare to a threshold value a value representing similarity between a feature of the member profile and a feature of the job posting;
learning a ranking model, the ranking model using decision trees as a learning to rank algorithm, the decision trees comprising the particular decision tree;
reading a decision tree structure of the particular decision tree;
converting, using at least one processor, a path from root to a leaf in the particular decision tree into an s-expression, the format of the s-expression representing a nested if/then/else statement;
retraining a logistic regression model utilizing the s-expression as an additional feature;
using the logistic regression model, generating a recommended jobs list for a member profile representing a member in an on-line social network system using at least one processor; and
causing items from the recommended jobs list to be presented on a display device of a member represented by the member profile in an on-line social network system.
2. The method of claim 1 , wherein items in the recommended jobs list are references to job postings from a plurality of job postings maintained in the on-line social network system.
3. (canceled)
4. The method of claim 1 , wherein the utilizing of the s-expression by the logistic regression model comprises using the s-expression as an additional non-linear feature in calculating a relevance score for a (member profile, job posting) pair.
5. The method of claim 4 , wherein the calculating of a relevance score for a (member profile, job posting) pair comprises using a sigmoid function.
6. The method of claim 5 , wherein the using of the s-expression as an additional non-linear feature in calculating a relevance score for a (member profile, job posting) pair comprises modifying the sigmoid function to incorporate the s-expression as an additional non-linear feature.
7. The method of claim 1 , comprising:
accessing one or more further s-expressions, the one or more further s-expressions representing one or more business rules;
constructing a decision tree based on the further s-expressions; and
including the decision tree into the ranking model.
8. The method of claim 7 , wherein a business rule from the one or more business rules is related to a job title represented by a feature from a member profile maintained in the on-line social network system.
9. The method of claim 7 , comprising storing the one or more business rules in a database associated with the on-line social network system.
10. The method of claim 1 , wherein the on-line social network system is a professional on-line network system.
11. A computer-implemented system comprising:
a learning to rank module, implemented using at least one processor, to:
construct a particular decision tree to determine a ranking score using respective features from a pair comprising a member profile representing a member in an on-line social network system and a job posting, the particular decision tree comprising a node to compare to a threshold value a value representing similarity between a feature of the member profile and a feature of the job posting;
learn a ranking model, the ranking model using decision trees as a learning to rank algorithm, the decision trees comprising the particular decision tree;
a converter, implemented using at least one processor, to:
read a decision tree structure of the particular decision tree, and convert a path from root to a leaf in the particular decision tree into an s-expression;
a classifier, implemented using at least one processor, to generate a recommended jobs list, for a member profile representing a member in an on-line social network system, utilizing the s-expression as a feature in a logistic regression model; and
a presentation module, implemented using at least one processor, to cause items from the recommended jobs list to be presented on a display device of a member represented by the member profile in an on-line social network system.
12. The system of claim 11 , wherein items in the recommended jobs list are references to job postings from a plurality of job postings maintained in the on-line social network system.
13. The system of claim 11 , wherein the classifier is to use the s-expression as an additional non-linear feature in retraining the logistic regression model.
14. The system of claim 11 , wherein the classifier is to use the s-expression as an additional non-linear feature in calculating a relevance score for a (member profile, job posting) pair.
15. The system of claim 14 , wherein the classifier is to use a sigmoid function to calculate a relevance score for a (member profile, job posting) pair.
16. The system of claim 15 , wherein the sigmoid function is modified to incorporate the s-expression as an additional non-linear feature.
17. The system of claim 11 , wherein the converter is to:
access one or more further s-expressions, the one or more further s-expressions representing one or more business rules;
construct a decision tree based on the further s-expressions; and
include the decision tree into the ranking model.
18. The system of claim 17 , wherein a business rule from the one or more business rules is related to a job title represented by a feature from a member profile maintained in the on-line social network system.
19. The system of claim 17 , wherein the one or more business rules are stored in a database associated with the on-line social network system.
20. A machine-readable non-transitory storage medium having instruction data executable by a machine to cause the machine to perform operations comprising:
constructing a particular decision tree to determine a ranking score using respective features from a pair comprising a member profile representing a member in an on-line social network system and a job posting, the particular decision tree comprising a node to compare to a threshold value a value representing similarity between a feature of the member profile and a feature of the job posting;
learning a ranking model, the ranking model using decision trees as a learning to rank algorithm, the decision trees comprising the particular decision tree;
reading a decision tree structure of the particular decision tree;
converting a path from root to a leaf in the particular decision tree into an s-expression; retraining a logistic regression model utilizing the s-expression as an additional feature;
using the logistic regression model, generating a recommended jobs list for a member profile representing a member in an on-line social network system; and
causing items from the recommended jobs list to be presented on a display device of a member represented by the member profile in an on-line social network system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/788,717 US20170004455A1 (en) | 2015-06-30 | 2015-06-30 | Nonlinear featurization of decision trees for linear regression modeling |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/788,717 US20170004455A1 (en) | 2015-06-30 | 2015-06-30 | Nonlinear featurization of decision trees for linear regression modeling |
Publications (1)
Publication Number | Publication Date |
---|---|
US20170004455A1 true US20170004455A1 (en) | 2017-01-05 |
Family
ID=57683242
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/788,717 Abandoned US20170004455A1 (en) | 2015-06-30 | 2015-06-30 | Nonlinear featurization of decision trees for linear regression modeling |
Country Status (1)
Country | Link |
---|---|
US (1) | US20170004455A1 (en) |
Worldwide Applications (1)
Filing Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2015-06-30 | US | US14/788,717 | US20170004455A1 (en) | not active, Abandoned |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150262064A1 (en) * | 2014-03-17 | 2015-09-17 | Microsoft Corporation | Parallel decision tree processor architecture |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170034313A1 (en) * | 2015-07-31 | 2017-02-02 | LinkedIn Corporation | Organizational directory access client and server leveraging local and network search |
US9961166B2 (en) * | 2015-07-31 | 2018-05-01 | Microsoft Technology Licensing, Llc | Organizational directory access client and server leveraging local and network search |
US10366093B2 (en) * | 2016-05-11 | 2019-07-30 | Baidu Online Network Technology (Beijing) Co., Ltd | Query result bottom retrieval method and apparatus |
US20180247271A1 (en) * | 2017-02-28 | 2018-08-30 | LinkedIn Corporation | Value of content relevance through search engine optimization |
US20180253658A1 (en) * | 2017-03-01 | 2018-09-06 | Microsoft Technology Licensing, Llc | Understanding business insights and deep-dive using artificial intelligence |
US11037251B2 (en) * | 2017-03-01 | 2021-06-15 | Microsoft Technology Licensing, Llc | Understanding business insights and deep-dive using artificial intelligence |
US20180330331A1 (en) * | 2017-05-10 | 2018-11-15 | Accenture Global Solutions Limited | Processing relationally mapped data to generate contextual recommendations |
US10891295B2 (en) * | 2017-06-04 | 2021-01-12 | Apple Inc. | Methods and systems using linear expressions for machine learning models to rank search results |
WO2019023364A1 (en) * | 2017-07-26 | 2019-01-31 | Microsoft Technology Licensing, Llc | Job applicant probability of confirmed hire |
US20190034883A1 (en) * | 2017-07-26 | 2019-01-31 | Microsoft Technology Licensing, Llc | Job applicant probability of confirmed hire |
US11132645B2 (en) * | 2017-07-26 | 2021-09-28 | Microsoft Technology Licensing, Llc | Job applicant probability of confirmed hire |
CN110914846A (en) * | 2017-07-26 | 2020-03-24 | 微软技术许可有限责任公司 | Probability of job seeker confirming employment |
US10990899B2 (en) * | 2017-08-11 | 2021-04-27 | Microsoft Technology Licensing, Llc | Deep and wide machine learned model for job recommendation |
CN109409516A (en) * | 2017-08-11 | 2019-03-01 | 微软技术许可有限责任公司 | Machine learning model for the tool depth and width that position is recommended |
US11947069B2 (en) | 2018-05-15 | 2024-04-02 | Schlumberger Technology Corporation | Adaptive downhole acquisition system |
US20220035065A1 (en) * | 2018-09-28 | 2022-02-03 | Schlumberger Technology Corporation | Elastic adaptive downhole acquisition system |
US11828900B2 (en) * | 2018-09-28 | 2023-11-28 | Schlumberger Technology Corporation | Elastic adaptive downhole acquisition system |
CN111104307A (en) * | 2019-10-23 | 2020-05-05 | 广州市智能软件产业研究院 | Decision tree-based parameter-carrying protocol verification method |
US11783125B2 (en) * | 2020-05-27 | 2023-10-10 | Capital One Services, Llc | System and method for electronic text analysis and contextual feedback |
US20230101339A1 (en) * | 2021-09-27 | 2023-03-30 | International Business Machines Corporation | Automatic response prediction |
US20230230037A1 (en) * | 2022-01-20 | 2023-07-20 | Dell Products L.P. | Explainable candidate screening classification for fairness and diversity |
US20230252418A1 (en) * | 2022-02-09 | 2023-08-10 | My Job Matcher, Inc. D/B/A Job.Com | Apparatus for classifying candidates to postings and a method for its use |
US12063244B1 (en) * | 2022-07-18 | 2024-08-13 | Trend Micro Incorporated | Protecting computers from malicious distributed configuration profiles |
US20240427781A1 (en) * | 2023-06-20 | 2024-12-26 | Digiwin Software Co., Ltd | Label architecture building system and label architecture building method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9626654B2 (en) | Learning a ranking model using interactions of a user with a jobs list | |
US20170004455A1 (en) | Nonlinear featurization of decision trees for linear regression modeling | |
US10324937B2 (en) | Using combined coefficients for viral action optimization in an on-line social network | |
US20180137589A1 (en) | Contextual personalized list of recommended courses | |
US9959353B2 (en) | Determining a company rank utilizing on-line social network data | |
US10042944B2 (en) | Suggested keywords | |
US10602226B2 (en) | Ranking carousels of on-line recommendations of videos | |
US9727654B2 (en) | Suggested keywords | |
US20180144305A1 (en) | Personalized contextual recommendation of member profiles | |
US20180046986A1 (en) | Job referral system | |
US20180308057A1 (en) | Joint optimization and assignment of member profiles | |
US10552428B2 (en) | First pass ranker calibration for news feed ranking | |
US20160217139A1 (en) | Determining a preferred list length for school ranking | |
US20180253695A1 (en) | Generating job recommendations using job posting similarity | |
US20180253694A1 (en) | Generating job recommendations using member profile similarity | |
US20170193452A1 (en) | Job referral system | |
US20180137588A1 (en) | Contextual personalized list of recommended courses | |
US20150331879A1 (en) | Suggested keywords | |
US20180137587A1 (en) | Contextual personalized list of recommended courses | |
US20180089779A1 (en) | Skill-based ranking of electronic courses | |
US20180089170A1 (en) | Skills detector system | |
US20180039944A1 (en) | Job referral system | |
US20160217540A1 (en) | Determining a school rank utilizing perturbed data sets | |
US20200175455A1 (en) | Classification of skills | |
US20190188325A1 (en) | Determining connection suggestions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: LINKEDIN CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TANG, LIJUN;HUANG, ERIC;MIAO, XU;AND OTHERS;SIGNING DATES FROM 20150626 TO 20150629;REEL/FRAME:036739/0789 |
|
AS | Assignment |
Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LINKEDIN CORPORATION;REEL/FRAME:044746/0001 Effective date: 20171018 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |