System and method for listing data acquisition

Information

  • Patent Grant
  • 8135704
  • Patent Number
    8,135,704
  • Date Filed
    Saturday, March 11, 2006
  • Date Issued
    Tuesday, March 13, 2012
Abstract
A method and system of data acquisition by a listing service provider is disclosed. A network address is received from a client computer that is operated by a lister. The network address can be indicative of a location of listing data on a computer network. The listing data comprises at least one information item provided by the lister. The network address received from the lister is accessed by opening a computer network connection to retrieve the listing data. The lister makes available the listing data for retrieval so that the listing data can be posted in a search bank hosted by the listing service provider. The listing data is retrieved from the network address using the computer network connection by copying the listing data onto a listing data database.
Description
BACKGROUND

1. Field of the Disclosure


The present disclosure relates to listing services. In particular, it relates to systems and methods of data acquisition for listing service providers.


2. General Background


In the current generation of listing services, listing service providers use World Wide Web (web) crawlers to visit and pick up billions of listings from every accessible source on the Internet. Typically, upon the listing service provider acquiring the listings through web crawling, the listings are processed, indexed into search banks, and published on an Internet website. Listing service providers include, for example, auction websites listing auctioned items, job databanks listing employment opportunities and openings, and real estate listing companies, among others. End users, seeking specific items, are provided with an easy web search capability that allows them to extract relevant information that addresses their needs.


In addition, listers possess listing information to be conveyed to end users. Thus, for example, a lister can be any entity that sells or leases items, or provides services. Examples of listers include vehicle manufacturers and resellers, home owners, employers, etc.


Listing service providers collect data from multiple listers, classify the collected data, and list the data in a searchable form so as to provide an easy-to-use interface for users to find the items matching the criteria of the user. Listing service providers rely heavily on web crawlers to gather information. Listers publish listings on Internet sites utilizing different methodologies, standards, and data formats. As such, listing information obtained by web crawlers is generally unstructured and in a non-standardized format. Therefore, undirected web crawling is not completely reliable because the methodology results in poor-quality listings. Further, filtration systems utilized to eliminate irrelevant listings obtained through undirected web crawling require computing power that can be more productively used in other processes.


SUMMARY

In one aspect, the present disclosure is directed to a method of data acquisition by a listing service provider. As an example, a network address is received from a client computer that is operated by a lister. The network address can be a uniform resource locator. The network address can be indicative of a location of listing data on a computer network. The listing data comprises at least one information item provided by the lister. The network address received from the lister is accessed by opening a computer network connection to retrieve the listing data. The lister makes available the listing data for retrieval so that the listing data can be posted in a search bank hosted by the listing service provider. The listing data is retrieved from the network address using the computer network connection by copying the listing data onto a listing data database.


In a further exemplary aspect of the method, the listing data comprises job listings or real estate listings. In addition, the listing data in the listing data database can be analyzed for conformance to predetermined quality criteria. The listing data can also be categorized into one or more predetermined categories, and further stored in the search bank hosted by the listing service provider. Moreover, selected categorized job information data can be transferred from the search bank through a job search client server to a job searcher in response to a query by the job searcher. The listing data can be posted on an Internet website.


In another aspect of the method, an Internet website can be provided wherein a lister can administrate the listing data provided to the listing service provider. In yet another aspect of the method, the retrieved listing data is in an information resource definition language which is unique to the listing service provider. The information resource definition language can be based on extensible markup language.


In another aspect of the method, the information resource definition language is an extension of extensible markup language. For example, the information resource definition language can comprise a job element. In addition, the job element can comprise an employer company attribute and a job title attribute.


In another example, the information resource definition language can comprise a real estate element. In addition, the real estate element can comprise a location attribute and a price attribute.


In one embodiment, the network address can be provided by entering a uniform resource locator in an Internet web form hosted by the listing service provider. In another embodiment, the network address can be provided by transmitting the uniform resource locator as part of a hypertext source code tag.


In another aspect, an embodiment of the disclosure is a system for data acquisition by a listing service provider comprising an address receiving module and a web crawling module. The address receiving module receives a network address from a client computer that is operated by a lister. The network address is indicative of a location of listing data on a computer network. The listing data comprises at least one information item provided by the lister. The web crawling module accesses the network address received from the lister. The network address is accessed by opening a computer network connection to retrieve the listing data. The lister makes available the listing data for retrieval so that the listing data can be posted in a search bank hosted by the listing service provider. The web crawling module is configured to retrieve the listing data from the network address using the computer network connection by copying the listing data onto a listing data database.


In yet another aspect, a computer readable medium encodes a computer program of instructions for executing a computer process for data acquisition by a listing service provider. The computer process may comprise multiple steps. A network address is received from a client computer that is operated by a lister. The network address can be indicative of a location of listing data on a computer network. The listing data comprises at least one information item provided by the lister. The network address received from the lister is accessed by opening a computer network connection to retrieve the listing data. The lister makes available the listing data for retrieval so that the listing data can be posted in a search bank hosted by the listing service provider. The listing data is retrieved from the network address using the computer network connection by copying the listing data onto a listing data database.





DRAWINGS

By way of example, reference will now be made to the accompanying drawings.



FIG. 1 illustrates a listing data acquisition system in accordance with the present disclosure.



FIG. 2 illustrates a listing data acquisition system in which a lister feeds listing data to a listing service provider over a computer network utilizing a web interface in accordance with the present disclosure.



FIG. 3A illustrates a screen shot of a web form for entering listing data in a web browser in accordance with the present disclosure.



FIG. 3B illustrates a screen shot of a web form to upload a data file at a web browser in accordance with the present disclosure.



FIG. 4 illustrates a listing data acquisition system in which a lister feeds data to a listing service provider over a computer network utilizing a computer application in accordance with the present disclosure.



FIG. 5 illustrates a listing data acquisition system in which a listing service provider acquires data from a lister using a directed web crawler in accordance with the present disclosure.



FIG. 6A illustrates a source code example for including a reference to a universal resource locator in accordance with the present disclosure.



FIG. 6B illustrates a screen shot of a web form to submit a location of a listing data file in accordance with the present disclosure.



FIG. 7 illustrates a data flow diagram for a process of collection of listing data in accordance with the present disclosure.



FIG. 8 illustrates an integrated system for data acquisition and administration in accordance with the present disclosure.





DETAILED DESCRIPTION

A system and method of listing data collection and delivery is disclosed. Unlike traditional systems and methods of data acquisition, which rely on web crawling of non-standardized listing data, the system and method provided herein allow listers and listing service providers to communicate via a standard language. Thus, accurate and updated listing data can be provided to listing service providers. In addition, listers also benefit from better categorized listing data and thereby have higher chances that their listings will be easily accessed by users.


A listing service provider can utilize a directed web crawler to acquire data more accurately and efficiently. Furthermore, an Information Resource Definition (IRD) language can be utilized as a means for communication between a lister and the listing service provider.


In one embodiment, a lister can place listing data on a secure site and give to a listing service provider an indication of the location of the listing data. The listing service provider can then utilize a directed web crawler to collect the listing data at the specified location. In addition, the listing service provider can interpret the data based on standardized definitions. Therefore, a “pull” methodology may be used to acquire data at the listing manager. In this methodology, the lister provides to the listing service provider two pieces of information to find the data: the location of the data and the format of the data. Once the listing service provider receives this information, a directed web crawler can “pull” or collect the information from the specified location.
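By way of illustration only, the “pull” flow described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `ListingSource`, `http_fetch`, and `pull_listing_data` are hypothetical, the listing data database is modeled as an in-memory dictionary, and the default fetcher simply uses the standard library's `urllib.request.urlopen`.

```python
from dataclasses import dataclass
from typing import Callable, Dict
from urllib.request import urlopen


@dataclass(frozen=True)
class ListingSource:
    """The two pieces of information the lister supplies (hypothetical type)."""
    location: str      # where the listing data can be found, e.g. a URL
    data_format: str   # how the listing data is formatted, e.g. "ird"


def http_fetch(location: str) -> bytes:
    """Default fetcher: open a network connection and read the resource."""
    with urlopen(location) as response:
        return response.read()


def pull_listing_data(
    source: ListingSource,
    database: Dict[str, dict],
    fetch: Callable[[str], bytes] = http_fetch,
) -> None:
    """Directed crawl: visit only the supplied location and copy its data.

    The database is a plain dict here; a real system would use persistent
    storage. The fetcher is injectable so the flow can run without a network.
    """
    raw = fetch(source.location)
    database[source.location] = {"format": source.data_format, "raw": raw}
```

Because the crawler visits only the location that the lister supplied, no discovery or link-following step appears in the sketch; the format hint is stored alongside the raw data so that a later interpretation step can select the appropriate parser.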


In another embodiment, listers are also provided with the capability to submit listing data to a listing service provider. Thus, a “push” methodology may be used to acquire data at the listing manager. Listers can utilize the Information Resource Definition (IRD) rules and deliver or push listing data to the listing service provider. The data delivered to the listing service provider can be formatted to comply with the IRD rules such that the listing service provider can interpret the data submitted.



FIG. 1 illustrates a listing data acquisition system 100 in accordance with the present disclosure. A lister can provide listing data to a listing service provider 101 that posts listing data on the Internet 108. In one example, the lister utilizes a lister's computer 102 to communicate with a listing manager 110 through the Internet 108. The listing manager 110 can be a computer server managed by the listing service provider 101.


In one embodiment, the lister's computer 102 can include a listings database 104 that stores listing data ready for publication. For example, an employer having job openings in its engineering division can store job listings in the listings database 104 that are later transmitted to a listing service provider 101. The lister's computer can further include a feed module 106 that retrieves listing data from the listings database 104. In one embodiment, the feed module 106 delivers the retrieved listing data to the listing manager 110. In another embodiment, the feed module 106 places the retrieved listing data in a network-accessible site for the listing manager 110 to collect through the Internet 108.


The listing manager 110 can be a computing module, which resides in a computer infrastructure of a listing service provider 101. Alternatively, the listing manager 110 can be a computer server that resides in a computer infrastructure of a listing service provider 101. For example, a job listing service provider 101 can utilize a computer infrastructure to post all available job listings on the Internet 108. The listing manager 110 can reside in a computer server connected to the Internet 108. The listing data can be acquired by requesting the data from the feed module 106, by scraping the data published on the Internet 108 by the feed module 106, or by simply receiving the listing data submitted by the feed module 106.


In addition, once the listing information is acquired from the lister's computer 102, the listing manager 110 can provide the listing information to a listing server 112, which in turn publishes, or otherwise makes available, the listing information on the Internet 108. The listing server 112 can be, for example, a web server, an ftp server, or any other server configured to post information on the Internet 108 for user viewing and searching.


Once published and listed, the listing data is available for users to view and search at a user computing device 114. The user computing device 114 can be a personal computer, a handheld device, or any other device that can access the Internet. Upon sending a request, the user computing device 114 receives listing information posted by the listing server 112. In one embodiment, the user computing device 114 includes a web browser and receives the listing data upon a request to the listing server 112. In another embodiment, the user computing device 114 receives the listing data based on a transmission that is initiated by the listing server 112. In one example, the user computing device 114 receives a Really Simple Syndication (RSS) feed. In another example, the user computing device 114 receives a podcast.


As previously mentioned, listers are provided the opportunity to render accurate listing data to a listing service provider 101 by using a common standardized format. The lister's computer 102 allows a lister to transmit listing data to a listing service by “pushing” the relevant listing data to the listing manager 110 through the Internet 108. In addition, the lister's computer 102 allows a lister to transmit the location of the listing data to a listing service so that the listing service can “pull” the relevant listing data and process the listing data in the listing manager 110.


Push Methodology


Listers can transmit listing data to a listing service provider 101 at any time by directly feeding the listing data. Therefore, the listing service provider 101 does not have to web crawl the Internet to acquire such listing information. In addition, the listing data that is provided by the lister can be in a standard format.


The lister can “push” the data to the listing service provider 101 through interfaces provided by the listing service provider 101. In one example, the listing service provider 101 makes available a website that has data fields for entering listing data. In another example, the listing service provider 101 makes available a file upload site in which a lister can provide the name of a file in a pre-specified format for uploading. In yet another example, the listing service provider 101 provides application programming interface (API) functions for the lister to develop a computer application for distributing the data to the listing service provider 101.



FIG. 2 illustrates a listing data acquisition system 200 in which a lister feeds listing data to a listing service provider 201 over a computer network utilizing a web interface in accordance with the present disclosure. The computer network may be, for example, the Internet 108. A lister's computer 202 can connect to a feed web server 210 to provide listing data. For example, a user at the lister's computer 202 can utilize a web browser 206 to access a web page hosted by the feed web server 210.


In one embodiment, the web form provided by the feed web server 210 is a form having one or more fields where data can be entered. The web browser 206 can display the forms hosted by the feed web server 210 and allow the user to enter listing data in the forms.


In another embodiment, the web interface provided by the feed web server 210 allows a lister to designate a file containing listing data for uploading. Thus, the lister at the lister's computer 202 can provide a file path indicative of the location of the file to be uploaded. Files of various formats can be uploaded and later parsed by the file interpretation module 208. In one example, the file is an Excel spreadsheet file. In another example, the file is an Extensible Markup Language (XML) file. In yet another example, the file is a Human Resources Extensible Markup Language (HR-XML) file. In another example, the file is a Resource Description Framework (RDF) file.
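By way of illustration only, the selection of a parser according to the format of the uploaded file can be sketched in Python as follows. The sketch rests on several assumptions that are not part of the disclosure: the function names and the `PARSERS` table are hypothetical, the format is inferred from the file extension, a comma-separated export stands in for a spreadsheet file (reading a native Excel file would require a third-party library), and the generic XML layout with `<listing>` children is invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

Listing = Dict[str, str]


def parse_csv(text: str) -> List[Listing]:
    """Spreadsheet-style input exported as CSV: one listing per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def parse_xml(text: str) -> List[Listing]:
    """Generic XML input: each <listing> element becomes one listing."""
    root = ET.fromstring(text)
    return [
        {field.tag: (field.text or "").strip() for field in listing}
        for listing in root.iter("listing")
    ]


# Hypothetical lookup table mapping file extensions to parsers.
PARSERS: Dict[str, Callable[[str], List[Listing]]] = {
    ".csv": parse_csv,
    ".xml": parse_xml,
}


def interpret_file(filename: str, text: str) -> List[Listing]:
    """Pick a parser from the file extension; reject unknown formats."""
    for extension, parser in PARSERS.items():
        if filename.lower().endswith(extension):
            return parser(text)
    raise ValueError(f"unsupported listing file format: {filename}")
```

Each parser returns the same list-of-dictionaries shape, so later processing does not need to know which format the lister uploaded.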


Once the file has been interpreted by the file interpretation module 208, the feed web server 210 can transmit the listing data to the listing manager 110 for processing, categorization, sanitation of data, data format check, regular expression check, etc. In addition, a fraud pre-filtration process can be performed in order to verify that the data provided is current, relevant, and non-fraudulent. Categories for fraud filtration of data may include offensive listings, illegal listings, irrelevant listings, etc. Techniques utilized in automatic categorization of listing data, including job listings, are described in detail in U.S. patent application Ser. No. 10/920,588, filed August, 2004, and entitled Automatic Product Categorization, assigned to the assignee of this disclosure.
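By way of illustration only, the sanitation, data format check, regular expression check, and pre-filtration steps mentioned above can be sketched in Python as follows. All specifics are assumptions made for the example and are not taken from the disclosure: the required field names, the salary pattern, and the list of flagged terms are placeholders, and an actual fraud pre-filtration process would be considerably more involved.

```python
import re
from typing import Dict, List

# Assumed for illustration; not defined by the disclosure.
REQUIRED_FIELDS = ("title", "company", "description")
SALARY_PATTERN = re.compile(r"^\$?\d{1,3}(,\d{3})*(\.\d{2})?$")
FLAGGED_TERMS = {"wire transfer fee", "guaranteed income"}


def sanitize(listing: Dict[str, str]) -> Dict[str, str]:
    """Trim each value and collapse internal runs of whitespace."""
    return {key: " ".join(value.split()) for key, value in listing.items()}


def check_listing(listing: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the listing passes."""
    problems = []
    # Data format check: every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not listing.get(name):
            problems.append(f"missing field: {name}")
    # Regular expression check on an optional field.
    salary = listing.get("salary")
    if salary and not SALARY_PATTERN.match(salary):
        problems.append("salary does not match expected pattern")
    # Simple pre-filtration: look for flagged phrases anywhere in the listing.
    text = " ".join(listing.values()).lower()
    for term in sorted(FLAGGED_TERMS):
        if term in text:
            problems.append(f"flagged term: {term}")
    return problems
```

Returning a list of problems rather than a single pass or fail value lets the listing manager report every defect to the lister at once.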


Upon processing the listing data, the listing manager 110 can further relay the listing data to the listing server 112 for posting the listings on a website on the Internet 108. Thereafter, a user searching for listed items of interest (e.g., a job seeker, a homebuyer, etc.) can access, through the user computing device 114, the listing data via the listing service provider 101.



FIG. 3A illustrates a screen shot of a web form 300 for entering listing data in a web browser in accordance with the present disclosure. The web form 300 can be provided by the listing service provider 201 and hosted at the feed web server 210. In one embodiment, the listing service provider 201 provides job listing information, and the web form 300 includes fields for entering a new job opening. Therefore, a lister such as an employer can enter a new job listing in the web form 300, which can later be displayed at the website of the listing service provider 201. A title field 302 can be provided to enter a title or designation for the job listing. Attribute fields 304 can be provided for entering further information regarding the listing being entered. For example, for a job listing, attributes that can be entered in the attribute fields 304 include company, experience, salary, degree, start date, end date, etc. Furthermore, the lister entering data, such as the employer, can configure additional attributes in association with the listing being entered. For example, if the job lister wishes to enter another attribute, such as a field for 401K benefits, the job lister can add a retirement benefits field by selecting an add function 306. In addition, a keyword field 308 and a description field 310 can also be provided to the lister in order to further qualify the listed job.



FIG. 3B illustrates a screen shot of an exemplary web form 312 that is used to upload a data file via a web browser in accordance with the present disclosure. As described above, a listing service provider 201 can further allow listers to upload data files containing listing data. The web form 312 can include a file path data field 314 to enter the location of the file to be uploaded.


In one embodiment, a lister may be a frequent user. The lister can be provided with a preconfigured or user-defined username for identification. The lister can upload a file under the username. The feed web server 210 can parse the uploaded file and further associate the listings in the file with the lister. The lister can then be provided with the ability to administrate the submitted data listings. In another embodiment, a lister, as a first-time user, is not required to have a username to upload a listing data file.


The file that is uploaded by the lister can be formatted in a standard format, which is also known to the file interpretation module 208. The file interpretation module 208 can include rules and definitions for an IRD file. In one example, the information resource definition can be established as shown in Table A. Table A below shows exemplary elements and corresponding attributes of an IRD file.











TABLE A

Element           Attributes        Description

ird:rd                              Identifies a new resource to be crawled by the crawler.
                  ird:property      The vertical identifier to place the new resource. “hj” = “Yahoo! HotJobs”
                  ird:ttl           The “Time to Live” of the resource. How often this resource is refreshed.
                  ird:gid           The id supplied by Yahoo! HotJobs
job:company                         Company.
                  job:name          A Name.
job:contact                         Contact information for a person or corporation.
                  job:name          A Name.
job:listing                         A listing for a particular resource.
                  job:display_type  How this listing should be displayed by Yahoo!.
                                    Premium - a fee-based listing where the resource provider is charged to get special treatment for the listing.
                                    Normal - Normal treatment for this listing.
job:title                           Title of the Job.
job:description                     Description of the Job.
job:category                        Category of the Job:
                                      Accounting_Finance             FIN
                                      Advertising_Public_Relations   ADV
                                      Arts_Entertainment_Publishing  ART
                                      Banking_Mortgage               BAM
                                      Clerical_Administrative        ADM
                                      Construction_Facilities        CON
                                      Customer_Service               CUS
                                      Education_Training             EDU
                                      Engineering_Architecture       ENG
                                      Government                     GOV
                                      Health_Care                    HEA
                                      Hospitality_Travel             HOS
                                      Human_Resources                HRS
                                      Insurance                      INS
                                      Internet_New_Media             NEW
                                      Law_Enforcement_Security       LAW
                                      Legal                          LEG
                                      Management_Consulting          MCO
                                      Manufacturing_Operations       MAN
                                      Marketing                      MAR
                                      Non_Profit_Volunteer           NON
                                      Pharmaceutical_Biotech         SCI
                                      Real_Estate                    RLE
                                      Restaurant_Food_Service        RFS
                                      Retail                         PUR
                                      Sales                          SAL
                                      Technology                     MIS
                                      Telecommunications             TEL
                                      Transportation_Logistics       TRA
                                      Work_At_Home                   OTH
job:experience                      Level of experience required for the job.
job:status                          Full-Time or Part-Time
job:level                           Level of education required. <BS, MS, PHD>
job:salary                          Salary/Wage of the job.
job:preurl                          If cookies are required to view the URL, this is the URL that must be accessed to obtain the cookies.
job:url                             URL for the hosting web page of the Job.
ird:location                        A Location. A job listing can have multiple locations. Each separate location of the job will be treated as a separate listing.
job:address                         Location - Street address of the job.
job:city                            Location - City of the Job.
job:state                           Location - State of the Job.
job:country                         Location - Country of the Job.









Table A shows an IRD syntax for providing job listings. In one example, the IRD syntax properties for a job listing can include company, title, description, category, experience, status, level, salary, etc. In addition, each property may include further attributes. Thus, for example, a lister that wants to post a job listing may provide the listing data in IRD format. A source code example of the contents of an IRD file that can be uploaded is as follows:














<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:ird="http://www.yahoo.com/2005-07-21-feed-syntax"
  xmlns:job="http://www.hotjobs.com/2005-07-21-hj-feed-syntax">
  <ird:rd ird:property="hj" ird:ttl="5 days" ird:custodian="admin@yahoo-inc.com" ird:gid="xxx4783">
   <job:company job:name="Yahoo!">
    <job:url> www.yahoo.com </job:url>
    <ird:location>
      <job:address> 701 N. First Street </job:address>
      <job:city> Sunnyvale </job:city>
      <job:state> CA </job:state>
      <job:country> US </job:country>
    </ird:location>
    <job:contact job:name="Adam Hyder">
        <job:email> ah@yahoo-inc.com </job:email>
        <job:phone1> 408 349-xxxx </job:phone1>
    </job:contact>
    <job:listing job:display_type="normal">
        <job:title> Technical Yahoo! </job:title>
        <job:description> Does some technical stuff. </job:description>
        <job:category> MIS </job:category>
        <job:experience> 5+ yrs </job:experience>
        <job:status> Full-Time </job:status>
        <job:level> BS Degree </job:level>
        <job:salary> $100,000 </job:salary>
        <job:preurl> www.yahoo.com/careers/list.html </job:preurl>
        <job:url> www.yahoo.com/careers/jobdetail1.html </job:url>
        <ird:location>
          <job:address> 701 N. First Street </job:address>
          <job:city> Sunnyvale </job:city>
          <job:state> CA </job:state>
          <job:country> US </job:country>
        </ird:location>
    </job:listing>
    <job:listing job:display_type="premium">
        <job:title> Sr. Technical Yahoo! </job:title>
        <job:description> Does some very technical stuff. </job:description>
        <job:category> MIS </job:category>
        <job:experience> 10+ yrs </job:experience>
        <job:status> Full-Time </job:status>
        <job:level> BS Degree </job:level>
        <job:salary> $120,000 </job:salary>
        <job:preurl> www.yahoo.com/careers/list.html </job:preurl>
        <job:url> www.yahoo.com/careers/jobdetail1.html </job:url>
        <ird:location>
          <job:address> 701 N. First Street </job:address>
          <job:city> Sunnyvale </job:city>
          <job:state> CA </job:state>
          <job:country> US </job:country>
        </ird:location>
    </job:listing>
   </job:company>
  </ird:rd>
</rdf:RDF>
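By way of illustration only, an IRD file of the kind shown above could be read with Python's standard `xml.etree.ElementTree` module as sketched below. The sketch is an assumption-laden example rather than the file interpretation module 208 itself: the function names are hypothetical, the embedded sample is a shortened, well-formed variant of the example (the second location, Burbank, is made up to exercise the multiple-location case), and only a few fields are extracted. Consistent with Table A, each separate location of a job is emitted as a separate record.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace URIs as they appear in the example IRD file.
NS = {
    "ird": "http://www.yahoo.com/2005-07-21-feed-syntax",
    "job": "http://www.hotjobs.com/2005-07-21-hj-feed-syntax",
}

# Shortened sample; the second location is invented for illustration.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:ird="http://www.yahoo.com/2005-07-21-feed-syntax"
  xmlns:job="http://www.hotjobs.com/2005-07-21-hj-feed-syntax">
  <ird:rd ird:property="hj" ird:ttl="5 days">
    <job:company job:name="Yahoo!">
      <job:listing job:display_type="normal">
        <job:title> Technical Yahoo! </job:title>
        <job:category> MIS </job:category>
        <ird:location><job:city> Sunnyvale </job:city></ird:location>
        <ird:location><job:city> Burbank </job:city></ird:location>
      </job:listing>
    </job:company>
  </ird:rd>
</rdf:RDF>"""


def text_of(parent: ET.Element, path: str) -> str:
    """Return the stripped text of a child element, or '' if absent."""
    node = parent.find(path, NS)
    return (node.text or "").strip() if node is not None else ""


def parse_ird(document: str) -> List[Dict[str, str]]:
    """Flatten an IRD document into one record per (listing, location)."""
    root = ET.fromstring(document)
    records = []
    for company in root.iterfind(".//job:company", NS):
        # Namespaced attributes are keyed as "{namespace-uri}name".
        company_name = company.get(f"{{{NS['job']}}}name", "")
        for listing in company.iterfind("job:listing", NS):
            for location in listing.iterfind("ird:location", NS):
                records.append({
                    "company": company_name,
                    "title": text_of(listing, "job:title"),
                    "category": text_of(listing, "job:category"),
                    "display_type": listing.get(f"{{{NS['job']}}}display_type", ""),
                    "city": text_of(location, "job:city"),
                })
    return records
```

Calling `parse_ird(SAMPLE)` yields two records for the single listing, one per location. A strict XML parser of this kind rejects attribute values that are unquoted or enclosed in typographic quotation marks, so an IRD file must use straight quotation marks to be read this way.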










FIG. 4 illustrates a listing data acquisition system 400 in which a lister feeds data to a listing service provider 401 over a computer network utilizing a computer application 406 in accordance with the present disclosure. In one example, a lister's computer 402 can be provided with application program interface (API) commands in an application development kit. The application development kit includes API commands for a specific format, message types, etc., that the web service server 402 is able to interpret. API commands can include, for example, commands for transmission of data, commands for acknowledgment of data received, commands for encryption/decryption of data, and other functions indicating the type of data being transmitted, etc. The web service server 402 either accepts or rejects data transmitted from one or more client computers via the Internet 108 depending on whether the API calls are correctly invoked.


A lister can utilize the APIs to include interfacing commands in a lister application 406 in order to interface automatically and directly with the web service server 402. As such, the lister can program existing applications or develop a new application that can retrieve listing data from a local listing database 404, and transmit such listing data to the web service server 402. For example, a lister can develop the computer application 406 to periodically retrieve listing data from the listing database 404 and transmit the data to the web service server 402. In another example, the lister can simply incorporate the APIs provided by the listing service provider 401 into existing applications, and make calls to the provided APIs in order to submit the data to the listing service provider 401. In yet another example, the lister can develop an application for data entry and submission to the web service server 402.


In another embodiment, the application development kit provides APIs that encode data in an IRD format for transmission to the web service server 402. As such, the web service server 402 can include, or be interfaced with, a file interpretation module 208 as illustrated in FIG. 2. The lister application 406 can be implemented to package listing data in IRD format for transmission to the web service server 402. The implementation of the lister application 406 can include API calls for encoding the data to IRD format and for transmitting the listing data directly to the file interpretation module 208.
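By way of illustration only, an API call that encodes listing data in IRD format could resemble the following Python sketch, which uses the standard `xml.etree.ElementTree` module. The function name `encode_ird` and its parameters are hypothetical and are not taken from any actual application development kit; the namespace URIs and the “hj” property value follow the example IRD file and Table A. Transmission to the web service server is omitted.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
IRD_NS = "http://www.yahoo.com/2005-07-21-feed-syntax"
JOB_NS = "http://www.hotjobs.com/2005-07-21-hj-feed-syntax"

# Use readable prefixes in the serialized output.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ird", IRD_NS)
ET.register_namespace("job", JOB_NS)


def encode_ird(company: str, listings: Iterable[Dict[str, str]], ttl: str) -> bytes:
    """Package listings for one company as an IRD document (hypothetical API).

    Each listing is a dict whose keys are element names from Table A,
    e.g. {"title": "...", "category": "MIS"}.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    resource = ET.SubElement(
        root,
        f"{{{IRD_NS}}}rd",
        {f"{{{IRD_NS}}}property": "hj", f"{{{IRD_NS}}}ttl": ttl},
    )
    company_el = ET.SubElement(
        resource, f"{{{JOB_NS}}}company", {f"{{{JOB_NS}}}name": company}
    )
    for listing in listings:
        listing_el = ET.SubElement(company_el, f"{{{JOB_NS}}}listing")
        for field, value in listing.items():
            ET.SubElement(listing_el, f"{{{JOB_NS}}}{field}").text = value
    return ET.tostring(root, encoding="utf-8")
```

Building the document through an XML library rather than by string concatenation ensures that attribute values are quoted and special characters are escaped, so the receiving side can parse the payload with any conforming XML parser.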


The file interpretation module 208 can reside at the web service server 402, or any other server shared by other processes in the computer configuration of the listing service provider 401. After the web service module 218 parses and interprets the listing data received, the web service module 210 transmits the data to the listing manager 110 for further processing, categorization, etc. Here again, the listing data is then published on a web site by the listing server 112 and then later viewed at a user computing device 114.


Pull Methodology


A data acquisition system can be used by a listing service provider in order to enhance the data acquisition model such that the data is gathered from specific Internet locations. In other words, listers can provide the location of the data to the listing service provider, which, in turn, collects or “pulls” the data from the network location provided by the lister. Once the listing service provider is provided with information regarding the location of the listing data, the listing service provider can initiate a directed web crawler to collect the data at the specific network location provided by the lister.


A directed web crawler can be a web crawler having a predetermined location on the Internet. Unlike traditional web crawlers, which travel around the Internet without a specific destination, a directed web crawler can collect the information from a specific location with the assurance that useful and meaningful data can be found at each visited web site.



FIG. 5 illustrates a listing data acquisition system 500 in which a listing service provider 501 acquires data from a lister using a directed web crawler in accordance with the present disclosure. A lister can utilize a lister's computer 502 communicably connected to the Internet 108 to provide a destination location to a web crawler module 510. The destination location can be a universal resource locator (URL) that references an ftp site, web site, or any other electronically accessible storing module. The lister's computer 502 can provide a destination location via a location feeding module 506. The listing service provider 501 can receive the listing data location at a listing data collection server 514. In particular, the listing data location can be received at a location receipt module 512. In one example, the location feeding module 506 can be a web browser wherein a user can enter a URL path. In another example, the location feeding module 506 provides markup language source code including a reference to the URL path. In yet another embodiment, the location feeding module 506 can be a separate process that connects directly to the location receipt module 512.


Furthermore, the location feeding module 506 can permit a lister to provide additional information to the location receipt module 512. For instance, the location feeding module 506 can permit the lister to schedule data collection times. In another example, the lister can set the frequency of data collections. Furthermore, the location feeding module 506 can be configured such that higher data collection frequency would entail a charge to the lister on a per-listing basis. Once the location of the listing data is received at the location receipt module 512, the web crawler module 510 can be activated to web crawl and collect the data from the indicated location. The crawling process can be activated periodically (e.g., every fifteen minutes), or with a frequency indicated by the user.
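One possible way to represent a submitted location together with its collection frequency is sketched below in Python. The class `LocationSubmission`, its fields, and the helper `due_for_collection` are hypothetical names chosen for this sketch. The fifteen-minute default mirrors the example period mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Example period from the description; a lister may indicate another one.
DEFAULT_INTERVAL = timedelta(minutes=15)


@dataclass
class LocationSubmission:
    """A listing data location plus scheduling details (hypothetical)."""

    url: str
    interval: timedelta = DEFAULT_INTERVAL
    last_collected: Optional[datetime] = None

    def is_due(self, now: datetime) -> bool:
        # A location that was never collected is always due.
        if self.last_collected is None:
            return True
        return now - self.last_collected >= self.interval


def due_for_collection(
    submissions: List[LocationSubmission], now: datetime
) -> List[LocationSubmission]:
    """Return the submissions the crawler should visit at time `now`."""
    return [s for s in submissions if s.is_due(now)]
```

A per-listing charge for higher collection frequencies, as contemplated above, could be computed from the `interval` field; that computation is left out of the sketch.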


In addition, the lister's computer 502 can include a listings database 504 that stores the listing data that the lister wishes to post to the Internet. In one embodiment, the listings database 504 is a single file. In another embodiment, the listings database 504 is a collection of files. The listings database 504 can be partially accessible via the network location supplied to the location receipt module 512. Alternatively, the listings database 504 can be fully accessible. Furthermore, the listings database 504 can be located at one site or distributed across multiple sites or locations.


The listings database 504 may comprise a text file, which can be collected by the web crawler module 510. The text file can include the listing data in a data format that is recognized by the file interpretation module 208. In addition, the text file can include listing data in a data format that is specific to the listing data collection server 514 such that only the file interpretation module 208 can fully interpret the data. As such, the file presented can be in an Excel, HR-XML, RDF, or other format. In addition, the file presented can also be in IRD format, as discussed above.


Once the file is collected, the file interpretation module 208 interprets the data based on the IRD definitions, and provides the data to the listing manager 110. As stated above, the listing manager 110 can provide data cleansing, removal of duplicate entries, quality checking, categorization, etc.
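The interpretation and duplicate-removal steps can be sketched as follows. This is a minimal sketch under stated assumptions: the root element name `ird` and the attribute names `company` and `title` are hypothetical stand-ins for the actual IRD definitions, and duplicates are detected by comparing attribute values after trimming whitespace and ignoring case.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def interpret_listings(document: str) -> List[Dict[str, str]]:
    """Turn each <job> element into a dictionary of its attributes.

    The element and attribute names are assumptions for illustration;
    the real IRD vocabulary is defined elsewhere in the disclosure.
    """
    root = ET.fromstring(document)
    return [dict(job.attrib) for job in root.iter("job")]


def remove_duplicates(listings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first occurrence of each listing, ignoring case and padding."""
    seen = set()
    unique = []
    for listing in listings:
        key = tuple(sorted((k, v.strip().lower()) for k, v in listing.items()))
        if key not in seen:
            seen.add(key)
            unique.append(listing)
    return unique
```

Quality checking and categorization, which the listing manager 110 can also perform, would follow the same pattern of a function applied to the interpreted listings.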



FIG. 6A illustrates a source code example for including a reference to a URL in accordance with the present disclosure. The uniform resource locator can reference an IRD file. In one example, the web crawler module 510, as part of collecting data at specific locations that have been previously specified, can also search Internet sites for source code indicating new listing data files. In another example, a separate crawling process can be used with the single task of retrieving URL addresses, which are later visited by the web crawler module 510 for collection of the listing data file.


In one embodiment, the source code with the resource URL can be embedded in an existing HTML source code of the lister's web pages. Regular HTML readers and parsers can disregard the data in the file. Specialized web crawling processes, however, can be configured to recognize tags associated with the resource URL and retrieve information related to the resource URL.
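A specialized crawling process of this kind might be sketched as follows. The tag shown in FIG. 6A is not reproduced here; the sketch assumes, purely for illustration, a `<link rel="listing-resource" href="...">` tag, and the names `ResourceLinkFinder` and `find_resource_urls` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class ResourceLinkFinder(HTMLParser):
    """Collect resource URLs from an assumed <link rel="listing-resource"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.resource_urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Only the assumed tag is recognized; all other markup is ignored,
        # much as a regular HTML reader would ignore the assumed tag.
        if tag == "link" and attributes.get("rel") == "listing-resource":
            href = attributes.get("href")
            if href:
                self.resource_urls.append(href)


def find_resource_urls(html: str) -> List[str]:
    """Return every resource URL referenced by the page, in document order."""
    finder = ResourceLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.resource_urls
```

The returned URLs can then be handed to the web crawler module 510 for collection of the listing data file.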


In another embodiment, the source code can be in a separate file in a designated directory (e.g., root directory of the lister's site). Thus, for example, at the top level of the site of the lister, a pre-established name that indicates the presence of a resource URL can be used (e.g. resource.xml). Each resource URL can be further identified by syntax unique to a standard language, such as IRD. Therefore, if the resource URL references an IRD file, then the IRD file can contain properties, attributes, and other elements specific to the IRD language.
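For the designated-directory approach, a crawling process only needs to derive the pre-established location from any address on the lister's site. The following sketch uses the example name `resource.xml` from the text; the function name `well_known_resource_url` is an assumption.

```python
from urllib.parse import urljoin


def well_known_resource_url(site_url: str, filename: str = "resource.xml") -> str:
    """Return the address of the pre-established file at the site's root."""
    # A leading slash makes urljoin resolve against the root directory,
    # regardless of how deep the given page sits in the site.
    return urljoin(site_url, "/" + filename)
```

For example, a page at `http://lister.example/careers/index.html` resolves to `http://lister.example/resource.xml`.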



FIG. 6B illustrates a screen shot of a web form 602 to submit a location of a listing data file in accordance with the present disclosure. As described above, a listing service provider 501 can further allow listers to enter a uniform resource locator address in a web form field 604. The URL can point to files in various formats. As previously discussed, one such file format can be the IRD file format.


In addition, a lister can be a frequent user provided with a username for identification. The lister can then enter the username at username field 606. The lister can transmit a URL under a preconfigured username. Upon collecting the listing data at the URL, the web crawler module 510 can relay the collected listing data to the listing data collection server 514 for further processing, which includes associating the data listings in the file with the lister. The lister can then be provided with the ability to administrate the submitted data listings. In another embodiment, a lister is a first-time user that is not required to have a username to upload a listing data file.



FIG. 7 illustrates a flow diagram for a listing data collection process in accordance with the present disclosure. A process 700 can be provided for data acquisition via directed crawling. The process begins at process block 702. At process block 704, the listing data location (e.g., a uniform resource locator) is provided. In one embodiment, the listing data location can be transmitted from the location feeding module 506 to a location receipt module 512. For example, a URL can be transmitted to the location receipt module 512 through a web form that is hosted by the listing data collection server 514. In another embodiment, the uniform resource locator can be provided by embedding code in existing webpages that are later transmitted or visited by the web crawler module 510. Next, at process block 706, listing data is collected from the location referenced by the uniform resource locator. In one example, listing data is collected by a directed web crawler that has specific information about the data path of the listing data. Once the listing data is collected, at decision block 708, it is determined whether the listing data is presented in information resource data format. If it is determined that the listing data is presented in information resource data format, then at process block 710 the listing data is interpreted according to rules defining the information resource data format. If it is determined that the listing data is not presented according to information resource data format, the process 700 ends at process block 712.
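The flow of process 700 can be summarized in a short Python sketch. The format test at decision block 708 is an assumption: the sketch treats a document as IRD when it is well-formed XML whose root element is named `ird`, a hypothetical name. The `fetch` callable stands in for the directed web crawler.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Optional


def is_ird(document: str) -> bool:
    """Decision block 708 (assumed test): well-formed XML with an <ird> root."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return False
    return root.tag == "ird"


def run_process_700(
    location: str, fetch: Callable[[str], str]
) -> Optional[List[Dict[str, str]]]:
    """Collect and, where possible, interpret listing data from one location."""
    document = fetch(location)          # process block 706: collect the data
    if not is_ird(document):            # decision block 708: format check
        return None                     # process block 712: end
    root = ET.fromstring(document)      # process block 710: interpret
    return [dict(item.attrib) for item in root]
```

Returning `None` for non-IRD data mirrors the flow diagram, in which the process simply ends; a fuller system could instead route such files to another interpreter.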


PALM Integration



FIG. 8 illustrates an integrated system 800 for data acquisition and administration in accordance with the present disclosure. In one embodiment, a data acquisition server 836 can house the web crawler module 510 and a web service module 806. The data acquisition server 836 allows the listing service provider 501 to acquire listing data from a lister through various techniques as described above. The data acquisition server 836 can interact with a listing manager 110 that categorizes, filters, cleanses, and in general maintains the listings located at PALM (platform for advanced listing management) database 814.


In one embodiment, the listing manager utilizes a PALM module 810 to process submitted or acquired listing information. The PALM module 810 and its functionalities are described in detail in U.S. patent application Ser. No. 11/174,393, filed Jun. 30, 2005, and entitled System and Method for Managing Listings, assigned to the assignee of the present application.


In addition, the listing manager 110 can include a real time listing module 812 that permits the immediate posting of recently acquired listings at the listing server 112. In one embodiment, the listing server 112 can access a webpages database 808 that stores data for providing listing webpages.


In another embodiment, an application server 832 can communicate with a listing administration web server 816. The listing administration web server 816 allows listers to check the status of submitted listing data, or listing data that was placed at a website for collection by the listing service provider.


The application server 832 may include modules for administrating listings associated with a lister. For example, a sign-on module 840 that interacts with a user database 822 includes logic to permit a lister to sign-in and gain access to administrative privileges. The application server 832 can further interact with an accounting module 826 that tracks financial gains and other monetary aspects related to the account of the lister. In addition, multiple operational modules can be provided in the application server 832 to allow a user to administrate listings, track performance and return on investment, set-up campaigns, etc.


Communicating with the listing administration server 816 may be an application server 820. The application server 820 includes an account maintenance module 834, a listing administration module 824, a campaign manager 828, and a reporting module 830. The account maintenance module 834 can provide the lister with an interface for viewing, paying, or inquiring about the latest billing, maintaining a profile, setting up multiple accounts, etc.


The listing administration module 824 permits a lister to add, delete, or edit listings. The campaign manager 828 can permit a lister to set-up campaigns for a listing or a group of listings. Finally, the reporting module 830 permits a lister to view the performance of listings, demographics and statistical analysis on how the listings are used, accessed, and treated by users.


Although certain illustrative embodiments and methods have been disclosed herein, it will be apparent from the foregoing disclosure to those skilled in the art that variations and modifications of such embodiments and methods may be made without departing from the true spirit and scope of the art disclosed. Many other examples of the art disclosed exist, each differing from others in matters of detail only. For instance, listing data can be related to listings for the sale or lease of various goods and services. Examples of listing data can include sale or lease of goods such as antiques, collectibles, bikes, boats, books, magazines, clothing, accessories, shoes, computers, electronics, cameras, furniture, items related to health care, items related to personal care, items for the home, items for the garden, jewelry, watches, movies, music recordings, office items, pet supplies, sports and outdoors items, toys and baby items, and video games.


Listing data can also be related to goods and service listings related to automobiles, such as used cars, new cars, certified pre-owned, research services, blue book pricing services, parts and accessories, machinery, tools, etc. Listing data can also be related to pets, such as cats, dogs, horses, birds, and related pet services.


Listing data can also be related to housing services, such as homes for sale, rentals, roommates, find a realtor, today's mortgage rates, find a mover, and credit reports. In addition, listing data can be related to tickets for events or traveling such as sports, concerts, theater, Broadway, traveling destinations, hotels, airfares, etc.


Listing data can be related to employment such as search jobs, posting a resume, creating job alerts, get career advice, searching by job category, etc. Employment related listing data can also be used in HotJobs as provided by Yahoo Inc.


Listing data can also be listings for services, such as wanted services, health care, personal care, computer services, creative, erotic, financial, legal, automotive, lessons, household, moving services, construction services, skilled trade, real estate, therapeutic, etc.


Listing data can also be related to personals ads such as platonic or casual encounters, women seeking women, women seeking men, men seeking women, men seeking men, romantic dinners or dates.


In addition, listings can be presented in the form of banners, images, symbols, etc. Listings can also be hyperlinked to an Internet address. Listings can be presented as symbols, or areas in a map, etc. Furthermore, a listing administration provider is any entity having a web site in which a lister can include a listing, such as an advertisement, so that users visiting the web site of the listing administration provider can select the advertisement and be redirected to the lister's web site.


As utilized herein, modules can be separate logical computer processes, separate hardware components, standalone computing devices, etc. Any web interface as provided herein can also be a computer application interface that does not interpret mark-up language but rather communicates directly in order to interface with a server computer.


Furthermore, it will also be apparent to one skilled in the art that any computer network such as a LAN, WAN, wireless network, etc., can be utilized to implement data acquisition. Accordingly, it is intended that the art disclosed shall be limited only to the extent required by the appended claims and the rules and principles of applicable law. All patents, patent applications and printed publications referred to here are hereby incorporated by reference in their entirety.

Claims
  • 1. A method, comprising:
      receiving, by a computing device operated by a listing service provider, a network address input by a lister into a web form field displayed by a client computer, wherein the network address is indicative of a location of listing data on a computer network, wherein the listing data comprises at least one information item provided by the lister, wherein the listing data is in a standardized information resource definition language unique to the listing service provider, wherein the information resource definition language is an extension of extensible markup language, and wherein the listing service provider publishes listing data received from a plurality of listers;
      accessing, by the computing device operated by the listing service provider, the network address received from the client computer, wherein the network address is accessed by opening a computer network connection to retrieve the listing data;
      storing, by the computing device operated by the listing service provider, the listing data in a listing data database; and
      posting, by the computing device operated by the listing service provider, the listing data in a search bank hosted by the listing service provider.
  • 2. The method of claim 1, wherein the listing data comprises job listings.
  • 3. The method of claim 1, wherein the listing data comprises real estate listings.
  • 4. The method of claim 1, further comprising analyzing the listing data at the listing data database for conformance to predetermined quality criteria.
  • 5. The method of claim 1, further comprising categorizing the listing data stored in the listing data database into one or more predetermined categories and storing the categorized listing data in the search bank hosted by the listing service provider.
  • 6. The method of claim 1, further comprising transferring selected categorized job information data from the search bank through a job search client server to a job searcher in response to a query by the job searcher.
  • 7. The method of claim 1, further comprising posting the listing data on an Internet website.
  • 8. The method of claim 1, further comprising providing an Internet website wherein a lister can administrate the listing data provided to the listing service provider.
  • 9. The method of claim 1, wherein the information resource definition language comprises a job element.
  • 10. The method of claim 9, wherein the job element comprises an employer company attribute and a job title attribute.
  • 11. The method of claim 1, wherein the information resource definition language comprises a real estate element.
  • 12. The method of claim 11, wherein the real estate element comprises a location attribute and a price attribute.
  • 13. The method of claim 1, wherein the network address is a uniform resource locator.
  • 14. The method of claim 1, wherein the network address is provided by any one of entering a uniform resource locator in an Internet web form hosted by the listing service provider, or transmitting the uniform resource locator as part of a hypertext source code tag.
  • 15. A system, comprising:
      a server computer operated by a listing service provider;
      an address receiving module implemented by said server, that receives a network address input by a lister into a web form field provided by the address receiving module, wherein the network address is indicative of a location of listing data on a computer network, wherein the listing data comprises at least one information item provided by the lister, wherein the listing data is in a standardized information resource definition language unique to the listing service provider, wherein the information resource definition language is an extension of extensible markup language, and wherein the listing service provider publishes listing data received from a plurality of listers;
      a web crawling module implemented by said server, that accesses the network address received from the lister, wherein the network address is accessed by opening a computer network connection to retrieve the listing data, the web crawling module configured to store the listing data in a listing data database; and
      a search bank hosted by the listing service provider for posting the listing data.
  • 16. The system of claim 15, wherein the listing data comprises job listings.
  • 17. The system of claim 15, wherein the listing data comprises real estate listings.
  • 18. The system of claim 15, further comprising a listing manager that analyzes the listing data at the listing data database for conformance to predetermined quality criteria.
  • 19. The system of claim 15, further comprising a listing manager that categorizes the listing data stored in the listing data database into one or more predetermined categories and stores the categorized listing data in the search bank hosted by the listing service provider.
  • 20. The system of claim 15, further comprising a listing manager that transfers selected categorized job information data from the search bank through a job search client server to a job searcher in response to a query by the job searcher.
  • 21. The system of claim 15, further comprising an Internet website wherein the listing data is posted.
  • 22. The system of claim 15, further comprising an Internet website wherein a lister can administrate the listing data provided to the listing service provider.
  • 23. The system of claim 15, wherein the information resource definition language comprises a job element.
  • 24. The system of claim 23, wherein the job element comprises an employer company attribute and a job title attribute.
  • 25. The system of claim 15, wherein the information resource definition language comprises a real estate element.
  • 26. The system of claim 25, wherein the real estate element comprises a location attribute and a price attribute.
  • 27. The system of claim 15, wherein the network address is a uniform resource locator.
  • 28. The system of claim 15, wherein the network address is provided by any one of entering a uniform resource locator at Internet web form hosted by the listing service provider, or transmitting the uniform resource locator as part of a hypertext source code tag.
  • 29. A computer readable storage medium tangibly encoding a computer program of instructions for executing a computer process for data acquisition, the computer process comprising:
      receiving a network address input by a lister into a web form field displayed by a client computer, wherein the network address is indicative of a location of listing data on a computer network, wherein the listing data comprises at least one information item provided by the lister, wherein the listing data is in a standardized information resource definition language unique to a listing service provider, wherein the information resource definition language is an extension of extensible markup language, and wherein the listing service provider publishes listing data received from a plurality of listers;
      accessing the network address received from the client computer, wherein the network address is accessed by opening a computer network connection to retrieve the listing data;
      storing the listing data in a listing data database; and
      posting the listing data in a search bank hosted by the listing service provider.
  • 30. The computer readable storage medium of claim 29, wherein the listing data comprises job listings.
  • 31. The computer readable storage medium of claim 29, wherein the listing data comprises real estate listings.
  • 32. The computer readable storage medium of claim 29, further comprising analyzing the listing data at the listing data database for conformance to predetermined quality criteria.
  • 33. The computer readable storage medium of claim 29, further comprising categorizing the listing data stored in the listing data database into one or more predetermined categories and storing the categorized listing data in the search bank hosted by the listing service provider.
  • 34. The computer readable storage medium of claim 29, further comprising transferring selected categorized job information data from the search bank through a job search client server to a job searcher in response to a query by the job searcher.
  • 35. The computer readable storage medium of claim 29, further comprising posting the listing data on an Internet website.
  • 36. The computer readable storage medium of claim 29, further comprising providing an Internet website wherein a lister can administrate the listing data provided to the listing service provider.
  • 37. The computer readable storage medium of claim 29, wherein the information resource definition language comprises a job element.
  • 38. The computer readable storage medium of claim 37, wherein the job element comprises an employer company attribute and a job title attribute.
  • 39. The computer readable storage medium of claim 29, wherein the information resource definition language comprises a real estate element.
  • 40. The computer readable storage medium of claim 39, wherein the real estate element comprises a location attribute and a price attribute.
  • 41. The computer readable storage medium of claim 29, wherein the network address is a uniform resource locator.
  • 42. The computer readable storage medium of claim 29, wherein the network address is provided by any one of entering a uniform resource locator at Internet web form hosted by the listing service provider, or transmitting the uniform resource locator as part of a hypertext source code tag.
RELATED APPLICATIONS

This application claims the benefit of priority of U.S. Provisional Patent Application Ser. No. 60/661,280, filed Mar. 11, 2005. This application is related to U.S. patent application Ser. No. 11/174,393, filed on Jun. 30, 2005, entitled SYSTEM AND METHOD FOR MANAGING LISTINGS. This application is also related to U.S. patent application Ser. No. 11/173,837, filed on Jun. 30, 2005, entitled SYSTEM AND METHOD FOR IMPROVED JOB SEEKING. This application is also related to U.S. patent application Ser. No. 11/173,656, filed on Jun. 30, 2005, entitled SEEKING SYSTEM AND METHOD FOR MANAGING JOB LISTINGS. This application is also related to U.S. patent application Ser. No. 11/173,470, filed on Jun. 30, 2005, entitled JOB CATEGORIZATION SYSTEM AND METHOD. This application is also related to U.S. patent application Ser. No. 11/372,528, entitled SYSTEM AND METHOD FOR LISTING ADMINISTRATION, filed on Mar. 11, 2006. The disclosures of all referenced applications are hereby incorporated by reference in their entirety.

Related Publications (1)
Number Date Country
20060206584 A1 Sep 2006 US
Provisional Applications (1)
Number Date Country
60661280 Mar 2005 US