This application relates generally to printing. The application relates more particularly to use of big data to extract human characteristics of a user-generated print file and recommend additional content associated with the big data.
Document processing devices include printers, copiers, scanners and e-mail gateways. More recently, devices employing two or more of these functions are found in office environments. These devices are referred to as multifunction peripherals (MFPs) or multifunction devices (MFDs). As used herein, MFPs are understood to comprise printers, alone or in combination with other of the afore-noted functions. It is further understood that any suitable document processing device can be used.
Authors of electronic print content will send their print files to an MFP which prints the document to display the content as it was received.
Various embodiments will become better understood with regard to the following description, appended claims and accompanying drawings wherein:
The systems and methods disclosed herein are described in detail by way of examples and with reference to the figures. It will be appreciated that modifications to disclosed and described examples, arrangements, configurations, components, elements, apparatuses, devices, methods, systems, etc. can suitably be made and may be desired for a specific application. In this disclosure, any identification of specific techniques, arrangements, etc. is either related to a specific example presented or is merely a general description of such a technique, arrangement, etc. Identifications of specific details or examples are not intended to be, and should not be, construed as mandatory or limiting unless specifically designated as such.
A relational database is a type of database that stores and provides access to data points that are related to one another. Relational databases are based on the relational model, an intuitive, straightforward way of representing data in tables. In a relational database, each row in the table is a record with a unique ID called the key. The columns of the table hold attributes of the data, and each record usually has a value for each attribute, making it easy to establish the relationships among data points.
A relational database organizes data into tables which can be linked, or related, based on data common to each. This capability enables one to retrieve an entirely new table from data in one or more tables with a single query.
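By way of illustration only, the following minimal sketch shows two tables linked on a common value and a single query that retrieves a new table from both. It uses Python's standard sqlite3 module; the table names, column names and sample data (users, print_jobs, and so on) are hypothetical and are not part of the disclosed system.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables related through the shared user_id value.
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE print_jobs (
        job_id  INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(user_id),
        title   TEXT
    );
    INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO print_jobs VALUES
        (10, 1, 'Marketing plan'),
        (11, 1, 'Investor letter'),
        (12, 2, 'Meeting notes');
""")

# A single query produces a new table drawn from both source tables.
rows = conn.execute("""
    SELECT users.name, print_jobs.title
    FROM users
    JOIN print_jobs ON print_jobs.user_id = users.user_id
    ORDER BY print_jobs.job_id
""").fetchall()

for name, title in rows:
    print(name, title)

conn.close()
```

The result pairs each user name with a job title, a combination stored in neither source table on its own.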
Relational databases are comprised of columns and rows. A column is a set of data values of a particular type, one value for each row of the database. A column may contain text values, numbers, or pointers to files in an operating system. Columns may comprise simple or more complex data types, such as whole documents, images, or multimedia, such as sound or video clips. A column can also be called an attribute. Each row provides a data value for each column and forms a single structured data value. For example, a database that represents company contact information might have the following columns: ID, Company Name, Address Line 1, Address Line 2, City, and Postal Code. More formally, a row is a tuple containing a specific value for each column, for example: (1234, ‘Big Company Inc.’, ‘123 East Example Street’, ‘456 West Example Drive’, ‘Big City’, 98765). The word ‘field’ is normally used interchangeably with ‘column’.
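The company contact example above can be sketched as follows, again using Python's standard sqlite3 module. The table name contacts and the lowercase column identifiers are assumptions chosen for illustration; only the column meanings and the sample row come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One column per attribute named in the example; id serves as the key.
conn.execute("""
    CREATE TABLE contacts (
        id             INTEGER PRIMARY KEY,
        company_name   TEXT,
        address_line_1 TEXT,
        address_line_2 TEXT,
        city           TEXT,
        postal_code    INTEGER
    )
""")

# A row is a tuple holding one value for each column.
row = (1234, "Big Company Inc.", "123 East Example Street",
       "456 West Example Drive", "Big City", 98765)
conn.execute("INSERT INTO contacts VALUES (?, ?, ?, ?, ?, ?)", row)

# Look the record up by its key and pair each value with its column name.
cursor = conn.execute("SELECT * FROM contacts WHERE id = ?", (1234,))
columns = [description[0] for description in cursor.description]
stored = cursor.fetchone()
record = dict(zip(columns, stored))

print(record)
conn.close()
```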
Big data uses relational databases to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Big data analysis challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy, and data source.
Predictive analytics, user behavior analytics or any other suitable data analytics method can be used to extract value from big data.
Relational database management systems and desktop statistical software packages used to visualize data often have difficulty processing and analyzing big data. Processing and analysis of big data may require software running in multiple servers.
As used herein, Big Data is an analytics dataset comprised of information that users are continuously generating both online and offline. Big Data targets understanding of MFP users. It includes what users have scanned or printed, and may also include such information as their age, gender, occupation, location, travel history, social media activity or favorite articles.
By way of example, one may wish to create a marketing strategy for potential investors. This can lead a company to improved profitability. The challenge is how to create better content to deliver improved results. A team of talented writers and designers may increase a likelihood of success, but this may not be optimal. Sometimes a single precisely targeted word can make the difference between a good marketing strategy and a better, or best, marketing strategy, increasing the probability that the strategy will be effective against one's competitors.
In an example embodiment herein, a user has already written a document and wants to print it. During the print submission process, the system will use Big Data to search for better content and provide the user with suggestions to modify the document before printing.
Information can be used, such as how many similar articles people find. Thousands of new posts are published every day. Many of these posts may provide content that people like. If an article contains a lot of meaningful content and most people like it, the system will improve its ranking and suggest that content to users. The system can learn and adapt to new data without human intervention. A user will typically write a document using text or characters. Here are some examples wherein recommended content, such as additional or corrected information, is generated from extracted text from a print file.
Text: “Michael Jeffrey Jordan was born on Feb. 17, 1964”
Issue: Incorrect information.
Recommendation: Michael Jeffrey Jordan was born on Feb. 17, 1963
Text: “In his article Stanley Fish shows that we don't really have the right to free speech.”
Issue: A thesis takes a position on an issue.
Recommendation: Stanley Fish's argument that free speech exists more as a political prize than as a legal reality ignores the fact that even as a political prize it still serves the social end of creating a general cultural atmosphere of tolerance that may ultimately promote free speech in our nation just as effectively as any binding law.
Text: “The government has the right to limit free speech.”
Issue: A thesis should be as specific as possible, and it should be tailored to reflect the scope of the paper.
Recommendation: The government has the right to limit free speech in cases of overtly racist or sexist language because our failure to address such abuses would effectively suggest that our society condones such ignorant and hateful views.
Text: “Although we have the right to say what we want, we should avoid hurting other people's feelings.”
Issue: A thesis must be arguable.
Recommendation: If we can accept that emotional injuries can be just as painful as physical ones we should limit speech that may hurt people's feelings in ways similar to the way we limit speech that may lead directly to bodily harm.
Text: “There are many reasons we need to limit hate speech.”
Issue: A good argumentative thesis provides not only a position on an issue, but also suggests the structure of the paper.
Recommendation: Among the many reasons we need to limit hate speech the most compelling ones all refer to our history of discrimination and prejudice, and it is, ultimately, for the purpose of trying to repair our troubled racial society that we need hate speech legislation.
Text: “Hate speech can cause emotional pain and suffering in victims just as intense as physical battery.”
Issue: A thesis statement that makes a factual claim that can be verified only with scientific, sociological, psychological, or other kind of experimental evidence is not appropriate.
Recommendation: The various arguments against the regulation of hate speech depend on the unspoken and unexamined assumption that emotional pain is trivial.
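By way of illustration only, the first example above (incorrect information) might be represented and detected as in the following minimal sketch. The Suggestion record, the KNOWN_BIRTH_DATES lookup and the check_birth_date function are hypothetical names introduced here; the small dictionary merely stands in for the Big Data source, and the pattern match is a simplification rather than the disclosed analysis.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suggestion:
    """One recommendation: the extracted text, the issue, and the proposed text."""
    text: str
    issue: str
    recommendation: str


# Hypothetical reference data standing in for a far larger dataset.
KNOWN_BIRTH_DATES = {"Michael Jeffrey Jordan": "Feb. 17, 1963"}

# Matches sentences of the form "<name> was born on <date>".
BIRTH_PATTERN = re.compile(r"^(?P<name>.+?) was born on (?P<date>.+?)\.?$")


def check_birth_date(sentence: str) -> Optional[Suggestion]:
    """Return a Suggestion if the stated birth date disagrees with the reference data."""
    match = BIRTH_PATTERN.match(sentence.strip())
    if match is None:
        return None
    name, date = match.group("name"), match.group("date")
    expected = KNOWN_BIRTH_DATES.get(name)
    if expected is None or expected == date:
        return None  # Unknown person, or the stated date is already correct.
    return Suggestion(
        text=sentence,
        issue="Incorrect information.",
        recommendation=f"{name} was born on {expected}",
    )


suggestion = check_birth_date("Michael Jeffrey Jordan was born on Feb. 17, 1964")
if suggestion is not None:
    print(suggestion.issue, suggestion.recommendation)
```

The remaining examples concern the quality of a thesis statement and would call for language analysis well beyond a lookup of this kind.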
Accordingly, when users print a document, they receive recommendations about similar and better content that may be of interest to them. This can be achieved, for example, by a system that considers an audience's demographics or observed behavior. After the system has found information that fits the user's document, the system rewrites or rephrases it, serving to alleviate plagiarism. A reference page or pages listing works relied upon are suitably included at the end of any document. The system continuously invests effort to understand the users, supply them with meaningful content, measure the success of each recommendation and determine which content performs better than the rest. This may further include information as to whether the user ultimately adopts some or all of the suggested modified content in their ultimate printout.
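One simple way such success might be measured is the fraction of times a piece of suggested content is adopted in the final printout. The following minimal sketch assumes this adoption-rate metric, which is not specified in this description; the function name rank_by_adoption and the sample event data are likewise hypothetical.

```python
from collections import defaultdict


def rank_by_adoption(events):
    """Rank content by adoption rate.

    events is an iterable of (content_id, was_adopted) pairs, one pair per
    occasion on which the content was suggested to a user.
    """
    shown = defaultdict(int)
    adopted = defaultdict(int)
    for content_id, was_adopted in events:
        shown[content_id] += 1
        if was_adopted:
            adopted[content_id] += 1
    rates = {content_id: adopted[content_id] / shown[content_id] for content_id in shown}
    # Highest adoption rate first.
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feedback: content A adopted once in two suggestions, and so on.
events = [("A", True), ("A", False), ("B", True), ("B", True), ("C", False)]
ranking = rank_by_adoption(events)
print(ranking)
```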
In the illustrated example, a user 104 wishes to scan or print a document on MFP 108. Scanning of scan document 112 generates a scan file which may be subject to optical character recognition. The user may also send a print job by uploading it to MFP 108 directly, or via a digital user device such as workstation 116 or smartphone 120. A print file or scan file is sent to an artificial intelligence/machine learning server 124 via network cloud 102. Server 124 is provided with Big Data on any suitable platform. Machine learning or artificial intelligence applications can be implemented on any suitable platform such as Microsoft's AZURE. Alternatives, by way of example, include platforms INZATA, ANSWEROCKET, SEEBO, and others.
Server 124 secures Big Data from sources such as Internet sources including social media posts, press releases, call center logs, customer feedback, third party data, consumer sentiment information, transaction logs, or the like. Big Data is also sourced from MFP information, including content of print files and scan files. Server 124 applies Big Data to a received print file or scan file, and determines human characteristics of an author of the print file. Server 124 then outputs suggestions for modification of the original document, including corrections, additions or deletions, which are relayed to user 104 such as by display on MFP touchscreen 128 of MFP 108, on workstation 116 or smartphone 120. Suggestions may also be added to the user's electronic document so as to be displayed in context. Such suggestions may also be viewed by printing the suggestions or the annotated document. As noted above, sources are suitably included in the annotations or as an attachment.
User 104 is provided with an ability to accept, reject or modify any generated suggestions. Once finalized, the final document is again sent to server 124 for further analysis and refinement of Big Data in accordance with user input. Modified document 132 is then printed. Accordingly, a user need only send a file for printing and the system works from there.
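By way of illustration only, the accept, reject or modify step might be sketched as follows. The Decision enumeration, the Suggestion record and the apply_decision function are hypothetical names introduced here, and plain string replacement stands in for whatever document editing the system actually performs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    MODIFY = "modify"


@dataclass
class Suggestion:
    original: str  # Text found in the user's document.
    proposed: str  # Replacement text suggested by the server.


def apply_decision(document: str, suggestion: Suggestion,
                   decision: Decision, user_text: Optional[str] = None) -> str:
    """Return the document after applying the user's decision on one suggestion."""
    if decision is Decision.REJECT:
        return document  # Leave the document unchanged.
    if decision is Decision.MODIFY:
        if user_text is None:
            raise ValueError("a MODIFY decision requires the user's own text")
        replacement = user_text
    else:
        replacement = suggestion.proposed
    # Replace only the first occurrence of the flagged text.
    return document.replace(suggestion.original, replacement, 1)


document = "Michael Jeffrey Jordan was born on Feb. 17, 1964"
suggestion = Suggestion(original="Feb. 17, 1964", proposed="Feb. 17, 1963")

accepted = apply_decision(document, suggestion, Decision.ACCEPT)
rejected = apply_decision(document, suggestion, Decision.REJECT)
modified = apply_decision(document, suggestion, Decision.MODIFY, user_text="February 17, 1963")
print(accepted)
```

In this sketch the finalized text, together with the decision taken, is what would be returned to the server for further refinement.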
Turning now to
Processor 202 is also in data communication with a storage interface 208 for reading or writing to a storage 216, suitably comprised of a hard disk, optical disk, solid-state disk, cloud-based storage, or any other suitable data storage as will be appreciated by one of ordinary skill in the art.
Processor 202 is also in data communication with a network interface 210 which provides an interface to a network interface controller (NIC) 214, which in turn provides a data path to any suitable wired interface or physical network connection 220, or to a wireless data connection via wireless network interface 218. Example wireless data connections include cellular, Wi-Fi, Bluetooth, NFC, wireless universal serial bus (wireless USB), satellite, and the like. Example wired interfaces include Ethernet, USB, IEEE 1394 (FireWire), Lightning, telephone line, or the like.
Processor 202 can also be in data communication with any suitable user input/output (I/O) interface 219 which provides data communication for interfacing with user peripherals, such as displays, keyboards, mice, track balls, touch screens, or the like. Processor 202 can also be in communication with hardware monitor 221, such as a page counter, temperature sensor, toner or ink level sensor, paper level sensor, or the like.
Also in data communication with data bus 212 is a document processor interface 222 suitable for data communication with the document rendering system 200, including MFP functional units. In the illustrated example, these units include copy hardware 240, scan hardware 242, print hardware 244 and fax hardware 246 which together comprise MFP functional hardware 250. It will be understood that functional units are suitably comprised of intelligent units, including any suitable hardware or software platform.
Turning now to
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the spirit and scope of the inventions.