This application relates to content management, and more particularly to automatic annotation of content at varying levels of detail.
The proliferation of Internet hosted content has been a boon to academia, businesses, and consumers alike. Opinions, research articles, books, photographs, and video are just some of the content available to be viewed both privately and publicly through the Internet. Along with the growth in available content, there has been a similar growth in the types of devices that can be used to access that content. Computers, tablets, e-readers, and smart phones are just some of the categories of devices available to consumers and businesses to access content.
As the types of devices that can access content have grown, the capabilities of the devices have become segmented. For example, devices can have a color screen or a black and white screen, devices can have varying resolutions, devices can have varying screen sizes, devices can have varying processing power, etc. The varying capabilities of devices can present challenges in the consumption of content. For example, the user of a device, such as a desktop computer with a large monitor, may desire to view a long detailed research article in its entirety. In contrast, a user of a smart phone with a three inch screen and limited screen resolution may instead desire to see only a brief abstract summarizing the article.
While the original author or creator of the content can create differing versions of the content, this relies on all authors to be good Samaritans to be useful on a grander scale. For the avoidance of doubt, the above-described contextual background shall not be considered limiting on any of the below-described embodiments, as described in more detail below.
The following presents a simplified summary of the specification in order to provide a basic understanding of some aspects of the specification. This summary is not an extensive overview of the specification. It is intended to neither identify key or critical elements of the specification nor delineate the scope of any particular embodiments of the specification, or any scope of the claims. Its sole purpose is to present some concepts of the specification in a simplified form as a prelude to the more detailed description that is presented in this disclosure.
Systems and methods disclosed herein relate to automatic annotation of content. An input component can receive content, wherein the content is at least partly textual content. An auto annotation component can generate differing sets of the content wherein a set among the sets of content is associated with a level of detail. An output component can send at least one set among the sets of the content to a content browser based on a specified level of detail.
In another embodiment, at least partly textual content can be received. Non-textual content associated with the at least partly textual content can be identified. Differing sets of the content can be generated where the differing sets are associated with a textual level of detail and a non-textual level of detail. A subset of the differing sets of content can be sent to a content browser based on a requested level of detail.
The following description and the drawings set forth certain illustrative aspects of the specification. These aspects are indicative, however, of but a few of the various ways in which the principles of the specification may be employed. Other advantages and novel features of the specification will become apparent from the following detailed description of the specification when considered in conjunction with the drawings.
The various embodiments are now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the various embodiments. It may be evident, however, that the various embodiments can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the various embodiments.
Systems and methods disclosed herein provide for auto annotation of content. The system provides for automatically creating different levels of abstraction of content where such levels were not previously available or explicitly provided. Content that is at least partly textual can be analyzed based on a combination of semantic features to determine key words, phrases, sentences, etc. that best represent a shorter version or versions of the textual content. It can be appreciated that through auto annotation, shorter versions of textual content like an article, a book, a description of associated non-textual content, etc. can relay key concepts or conclusions from the text in a smaller format that is more desirable for a user, or more easily consumed on a specific user device.
Referring now to
Morphological features can then be identified for each word in the set of words. Morphological features can include a part of speech, a gender, a case, a number, a date or a proper noun. For example, starting with the first word in the set of words, Alexandria can be identified as a noun that is capitalized. As “Alexandria” is the first word in the sentence, it is unclear during morphological analysis whether it is a proper noun or merely the first word in a sentence that is capitalized. Morphological analysis can proceed with every word in
It can be appreciated that during a morphological analysis, a word dictionary, a phrase dictionary, a person data store, a company data store, or a location data store can be used in determining morphological features associated with a word. For example, the word “Alexandria” can be identified as both a name and a location, for example, Alexandria, Va. or Alexandria, Egypt.
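The lookup described above can be sketched as follows. This is a minimal illustration only: the names `PERSON_NAMES`, `LOCATIONS`, `MorphFeatures`, and `morphological_features` are hypothetical, the two small sets stand in for the person and location data stores, and the rule that a data store hit settles proper-noun status is an assumption rather than something the description states.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical stand-ins for the person and location data stores.
PERSON_NAMES = {"alexandria"}
LOCATIONS = {"alexandria", "egypt"}


@dataclass
class MorphFeatures:
    word: str
    capitalized: bool
    sentence_initial: bool
    candidates: Set[str] = field(default_factory=set)  # e.g. {"name", "location"}
    proper_noun: Optional[bool] = None  # None means still ambiguous


def morphological_features(words: List[str]) -> List[MorphFeatures]:
    """Tag each word of one sentence; leave proper-noun status open when unclear."""
    results = []
    for index, word in enumerate(words):
        key = word.lower()
        feats = MorphFeatures(word, word[:1].isupper(), index == 0)
        if key in PERSON_NAMES:
            feats.candidates.add("name")
        if key in LOCATIONS:
            feats.candidates.add("location")
        if feats.candidates:
            # Assumption: a data store hit is enough to treat the word as a proper noun.
            feats.proper_noun = True
        elif feats.capitalized and not feats.sentence_initial:
            feats.proper_noun = True
        elif feats.capitalized:
            # Capitalized only at the start of the sentence: cannot decide yet.
            feats.proper_noun = None
        else:
            feats.proper_noun = False
        results.append(feats)
    return results
```

In this sketch, "Alexandria" keeps both the "name" and "location" candidates, leaving the choice between them to a later stage of analysis.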
Semantic analysis can follow parsing, and can be based on updated morphological features associated with the sets of words and sets of sentences. Semantic analysis provides for construction of ties between words within a sentence, identifying the words and/or phrases necessary for “meaning.” In effect, semantic analysis is the extraction of meaning from the text. Using the set of words identified in
Referring now to
Referring now to
A content browser 320, such as an Internet browser or application, can use the varying levels of content to select the most appropriate content based on user preferences or device capabilities. For example, screen size or resolution of a device can limit the types of non-textual content capable of being displayed. In addition, screen size and resolution can limit the amount of text that can be comfortably viewed on the screen. Certain users may also have content preferences unrelated to device capabilities. It can be appreciated that content browser 320 is capable of selecting multiple levels, a single level or no levels from the varying levels of content.
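As a rough illustration of how a content browser might map device capabilities and user preferences to levels of content, consider the sketch below. The `Device` type, the `choose_levels` function, the level names, and the screen-size thresholds are all assumptions made for this example and do not come from the description.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Device:
    screen_inches: float
    supports_video: bool


def choose_levels(device: Device, prefers_text_only: bool = False) -> Dict[str, str]:
    """Return the level requested per content type; omitted types are not requested."""
    # Illustrative thresholds only: small screens get the shortest text.
    if device.screen_inches < 4:
        text = "abstract"
    elif device.screen_inches < 10:
        text = "summary"
    else:
        text = "full"
    levels = {"text": text}
    # A user preference can exclude video even when the device could display it.
    if device.supports_video and not prefers_text_only:
        levels["video"] = "reduced" if device.screen_inches < 10 else "full"
    return levels
```

Leaving a content type out of the returned mapping corresponds to selecting no level for that type.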
Similar to the varying levels of textual content 312, varying levels of video content 314 can also be auto-annotated at 310. For example, varying levels of video content 314 can include selecting and playing a smaller percentage of the total video, adjusting a video compression codec, adjusting a video size, etc. It can be appreciated that varying levels of video content can also include a level that completely eliminates video content from being viewed by content browser 320. For example, a user may not desire to have video content be displayed on their respective content browser, or alternatively, a device using content browser 320 may not be capable of displaying certain video content. It can also be appreciated that video compression and video size can be adjusted depending on a device that is seeking to access the content via content browser 320.
Varying levels of audio content 316 can also be auto annotated at 310. For example, audio compression, audio size, or audio length can all be adjusted to provide the varying levels of audio content 316. In one example, audio can focus on a specific speaker where other sections of the audio related to other speakers can be removed. For example, using audio of a speech from a mayor as shown in
Varying levels of image content 318 can also be auto annotated at 310. Image compression, image size, or a number of images can all be annotated to provide varying levels of image content.
Referring now to
An auto annotation component 420 can automatically generate differing sets of the content in response to reception of the content wherein a set of the sets of the content is associated with a level of detail of a set of different levels of detail. It can be appreciated that the level of detail can include separate levels of detail associated with textual content, video content, audio content, image content, etc. Differing sets of content can include varying levels of annotation that retain varying amounts of the original content or supplement portions of the original content with new content that better summarizes the meaning of original content. Sets of content 404 can be stored within memory 402. It can be appreciated that memory 402 can be disparately located from network service 400.
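One way to picture the generation of differing sets of textual content is the sketch below. It substitutes a simple word-frequency score for the morphological and semantic analysis described in this disclosure, so it is a stand-in technique and not the disclosed method. The function names, the stop word list, and the retained fractions are assumptions for illustration.

```python
import re
from collections import Counter
from typing import Dict, List

STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "for", "on"}


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]


def _tokens(sentence: str) -> List[str]:
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def score_sentences(sentences: List[str]) -> List[float]:
    """Score each sentence by the average document-wide frequency of its words."""
    freq = Counter(w for s in sentences for w in _tokens(s))
    scores = []
    for sentence in sentences:
        words = _tokens(sentence)
        scores.append(sum(freq[w] for w in words) / max(len(words), 1))
    return scores


def generate_sets(text: str, fractions=(1.0, 0.5, 0.25)) -> Dict[int, List[str]]:
    """Map each level of detail (0 = full text) to the sentences kept at that level."""
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    sets = {}
    for level, fraction in enumerate(fractions):
        keep = max(1, round(len(sentences) * fraction))
        chosen = sorted(ranked[:keep])  # restore original sentence order
        sets[level] = [sentences[i] for i in chosen]
    return sets
```

Because each shorter level keeps a prefix of the same ranking, every level in this sketch is a subset of the more detailed levels above it.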
Output component 430 can send at least one set among the sets of the content to a content browser 320 based on a specified level of detail. In one embodiment, output component 430 can send at least one set among the sets of the content to a content browser further based on at least one of a user level of detail selection, a hardware profile, or a content browser setting.
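The selection made by the output component could look something like the following sketch. The description lists a user selection, a hardware profile, and a content browser setting as possible inputs but does not state how they are combined; the precedence order used here (user, then browser setting, then hardware) and the fallback to the most detailed level are assumptions, as is the function name `select_set`.

```python
from typing import Dict, List, Optional, Tuple


def select_set(
    sets: Dict[int, List[str]],
    user_selection: Optional[int] = None,
    browser_setting: Optional[int] = None,
    hardware_level: Optional[int] = None,
) -> Tuple[int, List[str]]:
    """Return (level, content) for the first requested level that exists."""
    # Assumed precedence: explicit user choice, then browser setting, then hardware.
    for requested in (user_selection, browser_setting, hardware_level):
        if requested in sets:
            return requested, sets[requested]
    # Assumed fallback: the lowest-numbered (most detailed) level.
    fallback = min(sets)
    return fallback, sets[fallback]
```

A request for a level that was never generated simply falls through to the next source of preference.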
Referring now to
Referring now to
Referring now to
Referring now to
Referring now to
Referring now to
At 1112, in response to the receiving or the determining, differing sets of content can be generated (e.g., by an auto annotation component) based at least in part on the morphological features for each in the set of words, wherein a set of content among the sets of the content is associated with a textual level of detail and a non-textual level of detail. At 1114, at least one set of content among the sets of content can be sent (e.g., by an output component) to a content browser based on a desired level of detail.
At 1218, in response to the receiving or the determining, differing sets of content can be automatically generated (e.g., by an auto annotation component) based at least in part on the extracted meaning, wherein a set of content among the sets of the content is associated with a textual level of detail and a non-textual level of detail. At 1220, at least one set of content among the sets of content can be sent (e.g., by an output component) to a content browser based on a desired level of detail.
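The final two acts, in which each set carries both a textual and a non-textual level of detail and one set is sent based on a desired level, can be sketched as below. `ContentSet`, `build_sets`, and `pick` are hypothetical names; pairing every textual level with every non-textual level and choosing the closest available pair are illustrative assumptions, not requirements of the method.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ContentSet:
    textual_level: int
    non_textual_level: int
    text: str
    media: Tuple[str, ...]


def build_sets(text_versions: Dict[int, str], media_versions: Dict[int, List[str]]) -> List[ContentSet]:
    """Pair every textual level with every non-textual level (an assumed strategy)."""
    return [
        ContentSet(t, m, text_versions[t], tuple(media_versions[m]))
        for t, m in product(sorted(text_versions), sorted(media_versions))
    ]


def pick(sets: List[ContentSet], textual_level: int, non_textual_level: int) -> ContentSet:
    """Return the set whose pair of levels is closest to the desired pair."""
    return min(
        sets,
        key=lambda s: abs(s.textual_level - textual_level)
        + abs(s.non_textual_level - non_textual_level),
    )
```

For example, with two textual levels and three non-textual levels, six sets are built, and a request for the shortest text with no media returns the set holding only the abstract.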
With reference to
The system bus 1308 can be any of several types of bus structure(s) including the memory bus or memory controller, a peripheral bus or external bus, and/or a local bus using any variety of available bus architectures including, but not limited to, Industry Standard Architecture (ISA), Micro-Channel Architecture (MCA), Extended ISA (EISA), Intelligent Drive Electronics (IDE), VESA Local Bus (VLB), Peripheral Component Interconnect (PCI), Card Bus, Universal Serial Bus (USB), Accelerated Graphics Port (AGP), Personal Computer Memory Card International Association bus (PCMCIA), Firewire (IEEE 1394), and Small Computer System Interface (SCSI).
The system memory 1306 includes volatile memory 1310 and non-volatile memory 1312. The basic input/output system (BIOS), containing the basic routines to transfer information between elements within the computer 1302, such as during start-up, is stored in non-volatile memory 1312. By way of illustration, and not limitation, non-volatile memory 1312 can include read only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory 1310 includes random access memory (RAM), which acts as external cache memory. According to present aspects, the volatile memory may store the write operation retry logic (not shown in
Computer 1302 may also include removable/non-removable, volatile/non-volatile computer storage media.
It is to be appreciated that
A user enters commands or information into the computer 1302 through input device(s) 1328. Input devices 1328 include, but are not limited to, a pointing device such as a mouse, trackball, stylus, touch pad, keyboard, microphone, joystick, game pad, satellite dish, scanner, TV tuner card, digital camera, digital video camera, web camera, and the like. These and other input devices connect to the processing unit 1304 through the system bus 1308 via interface port(s) 1330. Interface port(s) 1330 include, for example, a serial port, a parallel port, a game port, and a universal serial bus (USB). Output device(s) 1336 use some of the same type of ports as input device(s) 1328. Thus, for example, a USB port may be used to provide input to computer 1302, and to output information from computer 1302 to an output device 1336. Output adapter 1334 is provided to illustrate that there are some output devices 1336 like monitors, speakers, and printers, among other output devices 1336, which require special adapters. The output adapters 1334 include, by way of illustration and not limitation, video and sound cards that provide a means of connection between the output device 1336 and the system bus 1308. It should be noted that other devices and/or systems of devices provide both input and output capabilities such as remote computer(s) 1338.
Computer 1302 can operate in a networked environment using logical connections to one or more remote computers, such as remote computer(s) 1338. The remote computer(s) 1338 can be a personal computer, a bank server, a bank client, a bank processing center, a certificate authority, a router, a network PC, a workstation, a microprocessor based appliance, a peer device, a smart phone, a tablet, or other network node, and typically includes many of the elements described relative to computer 1302. For purposes of brevity, only a memory storage device 1340 is illustrated with remote computer(s) 1338. Remote computer(s) 1338 is logically connected to computer 1302 through a network interface 1342 and then connected via communication connection(s) 1344. Network interface 1342 encompasses wired and/or wireless communication networks such as local-area networks (LAN) and wide-area networks (WAN) and cellular networks. LAN technologies include Fiber Distributed Data Interface (FDDI), Copper Distributed Data Interface (CDDI), Ethernet, Token Ring and the like. WAN technologies include, but are not limited to, point-to-point links, circuit switching networks like Integrated Services Digital Networks (ISDN) and variations thereon, packet switching networks, and Digital Subscriber Lines (DSL).
Communication connection(s) 1344 refers to the hardware/software employed to connect the network interface 1342 to the bus 1308. While communication connection 1344 is shown for illustrative clarity inside computer 1302, it can also be external to computer 1302. The hardware/software necessary for connection to the network interface 1342 includes, for exemplary purposes only, internal and external technologies such as, modems including regular telephone grade modems, cable modems and DSL modems, ISDN adapters, and wired and wireless Ethernet cards, hubs, and routers.
Referring now to
The system 1400 also includes one or more server(s) 1404. The server(s) 1404 can also be hardware or hardware in combination with software (e.g., threads, processes, computing devices). The servers 1404 can house threads to perform, for example, identifying morphological features, extracting meaning, auto annotating content, etc. One possible communication between a client 1402 and a server 1404 can be in the form of a data packet adapted to be transmitted between two or more computer processes where the data packet contains, for example, a certificate. The data packet can include a cookie and/or associated contextual information, for example. The system 1400 includes a communication framework 1406 (e.g., a global communication network such as the Internet) that can be employed to facilitate communications between the client(s) 1402 and the server(s) 1404.
Communications can be facilitated via a wired (including optical fiber) and/or wireless technology. The client(s) 1402 are operatively connected to one or more client data store(s) 1408 that can be employed to store information local to the client(s) 1402 (e.g., cookie(s) and/or associated contextual information). Similarly, the server(s) 1404 are operatively connected to one or more server data store(s) 1410 that can be employed to store information local to the servers 1404.
The illustrated aspects of the disclosure may also be practiced in distributed computing environments where certain tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules can be located in both local and remote memory storage devices.
The processes described above can be embodied within hardware, such as a single integrated circuit (IC) chip, multiple ICs, an application specific integrated circuit (ASIC), or the like. Further, the order in which some or all of the process blocks appear in each process should not be deemed limiting. Rather, it should be understood that some of the process blocks can be executed in a variety of orders, not all of which may be explicitly illustrated herein.
What has been described above includes examples of the implementations of the present invention. It is, of course, not possible to describe every conceivable combination of components or methods for purposes of describing the claimed subject matter, but many further combinations and permutations of the subject embodiments are possible. Accordingly, the claimed subject matter is intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims. Moreover, the above description of illustrated implementations of this disclosure, including what is described in the Abstract, is not intended to be exhaustive or to limit the disclosed implementations to the precise forms disclosed. While specific implementations and examples are described herein for illustrative purposes, various modifications are possible that are considered within the scope of such implementations and examples, as those skilled in the relevant art can recognize.
In particular and in regard to the various functions performed by the above described components, devices, circuits, systems and the like, the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (e.g., a functional equivalent), even though not structurally equivalent to the disclosed structure, which performs the function in the herein illustrated exemplary aspects of the claimed subject matter. In this regard, it will also be recognized that the various embodiments include a system as well as a computer-readable storage medium having computer-executable instructions for performing the acts and/or events of the various methods of the claimed subject matter.