Ex parte Cao, Appeal 2017-007625, Application No. 13/267,738 (P.T.A.B. Mar. 27, 2019)

UNITED STATES DEPARTMENT OF COMMERCE
United States Patent and Trademark Office
Address: COMMISSIONER FOR PATENTS, P.O. Box 1450, Alexandria, Virginia 22313-1450, www.uspto.gov

APPLICATION NO.: 13/267,738    FILING DATE: 10/06/2011    FIRST NAMED INVENTOR: Xiang Cao
ATTORNEY DOCKET NO.: P10853US1/77870000120101    CONFIRMATION NO.: 3562
EXAMINER: ADESANYA, OLUJIMI A    ART UNIT: 2658
NOTIFICATION DATE: 03/29/2019    DELIVERY MODE: ELECTRONIC

150004 7590 03/29/2019
DENTONS US LLP - Apple
4655 Executive Dr, Suite 700
San Diego, CA 92121

Please find below and/or attached an Office communication concerning this application or proceeding. The time period for reply, if any, is set in the attached communication. Notice of the Office communication was sent electronically on the above-indicated "Notification Date" to the following e-mail address(es): patents.us@dentons.com, dentons_PAIR@firsttofile.com

PTOL-90A (Rev. 04/07)

UNITED STATES PATENT AND TRADEMARK OFFICE
BEFORE THE PATENT TRIAL AND APPEAL BOARD

Ex parte XIANG CAO, ALAN C. CANNISTRARO, GREGORY S. ROBBIN, and CASEY M. DOUGHERTY

Appeal 2017-007625
Application 13/267,738 [1]
Technology Center 2600

[1] The real party in interest is Apple Inc. App. Br. 4.

Before MAHSHID D. SAADAT, ST. JOHN COURTENAY III, and MICHAEL J. STRAUSS, Administrative Patent Judges.

COURTENAY, Administrative Patent Judge.

DECISION ON APPEAL

STATEMENT OF THE CASE

This is an appeal under 35 U.S.C. § 134(a) from the Examiner's Final Rejection of claims 1-3, 6-18, and 21-30. Claims 4, 5, 19, 20, and 31-34 are cancelled. We have jurisdiction under 35 U.S.C. § 6(b).

We affirm.

Invention

Embodiments of Appellants' claimed invention relate to "automatically creating a mapping between text data and audio data by
analyzing the audio data to detect words reflected therein and compare those words to words in the document." Spec. ¶ 4.

Representative Independent Method Claim 1

1. A method comprising:
receiving audio data that corresponds to at least a portion of a work, wherein at least a portion of a textual version of the work is displayed;
[L1] performing a speech-to-text analysis of the audio data to generate text for portions of the audio data, wherein the speech-to-text analysis employs a sliding window and wherein a set of words into which the audio data can be translated is limited to words in the sliding window; and
[L2] based on the text generated for the portions of the audio data, generating a mapping between a plurality of audio locations in the audio data and a corresponding plurality of text locations in the textual version of the work;
wherein the method is performed by one or more computing devices.

(Emphasis added regarding the contested limitations L1 and L2.)

Representative Independent Method Claim 14

14. A method comprising:
receiving a textual version of a work;
performing a text-to-speech analysis of the textual version to generate first audio data;
based on the first audio data and the textual version, generating a first mapping record correlating a first plurality of audio locations in the first audio data and a corresponding plurality of text locations in the textual version of the work;
receiving second audio data that reflects an audible version of the work for which the textual version exists; and
based on [L1] (1) a comparison of the first audio data with the second audio data and [L2] (2) the first mapping record, generating a second mapping record correlating a second plurality of audio locations in the second audio data and the plurality of text locations in the textual version of the work;
wherein the method is performed by one or more computing devices.
(Emphasis added regarding the contested limitations L1 and L2.)

Rejections [2]

A. Claims 1-3, 6, 15-18, 21, and 30 are rejected under pre-AIA 35 U.S.C. § 103(a) as being obvious over the combined teachings and suggestions of Beattie et al. (US 2007/0055514 A1, pub. Mar. 8, 2007) ("Beattie") in view of Mayer (US 6,282,511 B1, issued Aug. 28, 2001) ("Mayer").

B. Claims 14 and 29 are rejected under pre-AIA 35 U.S.C. § 103(a) as being obvious over the combined teachings and suggestions of Adams, Jr. et al. (US 6,017,219; issued Jan. 25, 2000) ("Adams") in view of Heckerman et al. (US 6,260,011 B1; issued July 10, 2001) ("Heckerman").

C. Claims 7-13 and 22-28 are rejected under pre-AIA 35 U.S.C. § 103(a) as being obvious over the combined teachings and suggestions of Beattie, Mayer, and Heckerman.

[2] We note the rejection under 35 U.S.C. § 101 was withdrawn by the Examiner. See Ans. 2. Therefore, it is not before us on appeal.

ANALYSIS

We have considered all of Appellants' arguments and any evidence presented. We have reviewed Appellants' arguments in the Briefs, the Examiner's obviousness rejections, and the Examiner's responses to Appellants' arguments. Appellants do not proffer sufficient argument or evidence to persuade us of error regarding the Examiner's underlying factual findings and ultimate legal conclusion of obviousness. See Ex parte Frye, 94 USPQ2d 1072, 1075 (BPAI 2010) (precedential) ("The panel ... reviews the obviousness rejection for error based upon the issues identified by appellant, and in light of the arguments and evidence produced thereon."). For at least the reasons discussed below, we agree with and adopt the Examiner's underlying factual findings and ultimate legal conclusion of obviousness, as set forth in the Final Action and Answer. In our analysis below, we highlight and address specific findings and arguments for emphasis.

Grouping of Claims

Based upon Appellants' arguments (App. Br.
11-26), and our discretion under 37 C.F.R. § 41.37(c)(1)(iv), we decide the appeal of Rejection A of claims 1, 2, 6, 15-17, 21, and 30 on the basis of representative claim 1. Because Appellants separately argue dependent claim 3, we decide the appeal of Rejection A of claim 3, and Rejection A of claim 18 (which depends upon claim 3), on the basis of representative claim 3. See App. Br. 26-29.

Also based upon Appellants' arguments (App. Br. 29-41), and our discretion under 37 C.F.R. § 41.37(c)(1)(iv), we decide the appeal of Rejection B of independent claim 14 and dependent claim 29 on the basis of representative claim 14. We separately address infra the remaining dependent claims 7-13 and 22-28, as rejected under obviousness Rejection C.

Contested Limitations L1 and L2 of Independent Claim 1 under Rejection A

Appellants contest the Examiner's findings regarding the following limitations (L1 and L2), as recited in representative independent claim 1:

[L1] performing a speech-to-text analysis of the audio data to generate text for portions of the audio data, wherein the speech-to-text analysis employs a sliding window and wherein a set of words into which the audio data can be translated is limited to words in the sliding window; and

[L2] based on the text generated for the portions of the audio data, generating a mapping between a plurality of audio locations in the audio data and a corresponding plurality of text locations in the textual version of the work[.]

Claim 1 (emphasis added); see App. Br. 11-26. [3]

[3] We give the contested claim limitations the broadest reasonable interpretation ("BRI") consistent with the Specification. See In re Morris, 127 F.3d 1048, 1054 (Fed. Cir. 1997).

Contested Limitation L1 of Claim 1 under Rejection A

Appellants particularly dispute the Examiner's findings regarding the wherein clause of limitation L1 of claim 1: "wherein a set of words into
which the audio data can be translated is limited to words in the sliding window." See App. Br. 13-18.

Appellants note that in an exemplary embodiment, "as the system performs speech-to-text analysis of audio data (e.g., an audio book), the system tracks a 'current translation position' relative to corresponding text data (e.g., an e-book)." App. Br. 12 (citing Spec. ¶ 44). Appellants explain: "As the speech-to-text analysis progresses, the system moves a sliding window across the text data based on the current translation position, and restricts the candidate words of the speech-to-text analysis to the words in the sliding window." App. Br. 12 (emphasis added) (citing Spec. ¶¶ 44-49).

The Examiner finds limitation L1 of claim 1 is taught or suggested principally by Mayer, at column 4, lines 47-57, and Figure 7. Final Act. 4. We reproduce the cited portion of Mayer in context below:

The speech recognition function for the system of the invention is particularly easy to implement because the speech recognizer generally needs only be able to recognize a small vocabulary of words at any given point in time-the vocabulary of hyperlink words and action words. To aid recognizer performance, a sliding window of hyperlink words may be used to define the recognizer vocabulary, so that, at any given point in time, that vocabulary would include the most recently played hyperlink word and some number of hyperlink words enunciated earlier (but, in general, less than the total of all previously played links).

Mayer, col. 4, ll. 46-65 (emphasis added).

Appellants contend:

Notably, the above description of Mayer is consistent with the portions of Mayer cited by the Examiner.
For example, at column 4, lines 49-51, which the Examiner cites and Appellants reproduce above, Mayer states that "the speech recognizer generally needs only be able to recognize a small vocabulary of words at any given point in time-the vocabulary of hyperlink words and action words." That is, the speech recognizer recognizes hyperlink words of the "sliding window" as well as the action words.

Appellants emphasize that the plain meaning of the term "limit" is "to prevent (something) from being larger, longer, more, etc." In other words, the plain language of the claim requires that "the set of words into which the audio can be translated" include no more than words in the sliding window. Because Mayer describes a recognizer vocabulary that includes more than words in a sliding window of hyperlink words, Mayer fails to teach "wherein the speech-to-text analysis employs a sliding window and wherein a set of words into which the audio data can be translated is limited to words in the sliding window" as recited in claim 1.

App. Br. 17 (footnotes and emphasis omitted).

In response to Appellants' argument that "the plain language of the claim requires that 'the set of words into which the audio can be translated' include[s] no more than words in the sliding window" (id.), we particularly note that claim 1 does not define or otherwise limit how the words in the sliding window are selected for inclusion in the sliding window. Although claim 1 requires "the speech-to-text analysis employs a sliding window" and the "set of words into which the audio data can be translated is limited to words in the sliding window," we find no language that limits or otherwise restricts how words are selected for inclusion in the set of words in the sliding window (emphasis added). [4]

[4] We note the scope of the claims on appeal, at a minimum, covers the corresponding supporting embodiment(s) described in the Specification.
We emphasize, however, that under a broad but reasonable interpretation of the claims in light of the Specification, the scope of the claims is not limited to the preferred embodiments described in Appellants' Specification. Even under a narrower Phillips claim construction (as construed in federal court), our reviewing court guides: "[A]lthough the specification often describes very specific embodiments of the invention, we have repeatedly warned against confining the claims to those embodiments. [C]laims may embrace 'different subject matter than is illustrated in the specific embodiments in the specification.'" Phillips v. AWH Corp., 415 F.3d 1303, 1323 (Fed. Cir. 2005) (en banc) (internal citations omitted).

When we look to the Specification for context, we find paragraphs 47-49 merely describe non-limiting, exemplary embodiments of the "sliding window" recited in claim 1. One of those embodiments is described as follows:

While a specific example was given above, the window may span any amount of text within the textual version of the work. ... For example, the textual version of a work may comprise a page indicator (e.g., in the form of an HTML or XML tag) that indicates, within the content of the textual version of the work, the beginning of a page or the ending of a page.

Spec. ¶ 48 (emphasis added).

Because the hyperlink words included in Mayer's sliding window (column 4, line 60) are also HTML tags, we find the Examiner's reading of the claim term "sliding window" on Mayer's "sliding window" that includes hyperlinks (col. 4, ll. 52-53) is fully consistent with Appellants' description in the Specification (¶ 48). See Mayer col. 4, ll. 52-53 ("a sliding window of hyperlink words may be used to define the recognizer vocabulary" (emphasis added)). Therefore, the premise for Appellants' argument that Mayer describes a recognizer vocabulary that includes different or additional words in a sliding window (e.g., the vocabulary of hyperlink and action words, col. 4, ll. 50-51) is immaterial to our analysis, because Appellants' argument is not commensurate with the scope of claim 1 under a broad but reasonable interpretation. See supra, n.3.

Because we find no claim language which limits or otherwise restricts how the words in the "sliding window" are selected for inclusion in the set of words in the "sliding window," we find Appellants' arguments unavailing regarding contested limitation L1 of claim 1.

Contested Limitation L2 of Claim 1 under Rejection A

Regarding contested limitation L2 of claim 1, Appellants contend:

Neither Beattie nor Mayer discloses or suggests 'based on the text generated for the portions of the audio data, generating a mapping between a plurality of audio locations in the audio data and a corresponding plurality of text locations in the textual version of the work' as recited in claim 1.

App. Br. 18 (boldface omitted).

The Examiner finds contested limitation L2 of claim 1 is principally taught by Beattie at paragraphs 46 and 49. Final Act. 3-4. The Examiner explains the basis for the rejection, as follows:

The sentence-by-sentence tracking provides 84 a visual indication (e.g., changes the color of the words, italicizes, etc.) for an entire sentence to be read by the user. ... The user reads the visually indicated portion and the system receives 86 the audio input. The system determines 88 if a correct reading of the indicated portion has been received. The portion remains visually indicated 90 until the speech recognition obtains an acceptable recognition from the user ..., para. [0046]; para. [0049], highlighting/color differentiating text upon pronunciation as providing mapping between audio and text[.]

Ans. 3-4 (emphasis omitted).
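For readers unfamiliar with the underlying technique, the sliding-window vocabulary restriction at issue in limitation L1 (the recognizer's candidate words are limited to a window of text around the current translation position, per Spec. ¶¶ 44-49) can be sketched roughly as follows. This is a purely illustrative sketch and not part of the record; all function names and parameters are hypothetical.

```python
# Illustrative sketch (not part of the record) of a sliding-window
# vocabulary restriction: as speech-to-text analysis progresses,
# candidate words are limited to words of the textual version that
# fall inside a window around the current translation position.

def window_vocabulary(text_words, current_pos, span=5):
    """Return the set of candidate words inside the sliding window."""
    lo = max(0, current_pos - span)
    hi = min(len(text_words), current_pos + span)
    return set(text_words[lo:hi])

def recognize(audio_token, text_words, current_pos):
    """Toy recognizer: accept the audio token only if it is a word
    currently inside the sliding window."""
    vocab = window_vocabulary(text_words, current_pos)
    return audio_token if audio_token in vocab else None

text = "call me ishmael some years ago never mind how long precisely".split()
print(recognize("ishmael", text, current_pos=2))    # inside the window
print(recognize("precisely", text, current_pos=2))  # too far ahead -> None
```

The sketch shows only the "limited to words in the sliding window" behavior the claim recites; a real system would also move `current_pos` forward as words are matched.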
Appellants note "Beattie is directed to 'tutoring software' that guides a user through an oral reading of a displayed passage." App. Br. 20 (citing 9 Appeal2017-007625 Application 13/267,738 Beattie ,r 41 ). Appellants explain: "Beattie' s system can visually indicate a sentence to be read by the user, perform speech recognition on an audio input from the user, and, based on the speech recognition analysis, verify whether the user's oral reading matches the sentence." Id. Appellants urge: Beattie' s process at most involves performing speech recognition on the user's audio input to determine if the user's reading matches displayed text. However, nothing in Beattie discloses or suggests that the process involves analyzing the user's audio input to identify "a plurality of audio locations" in the user's audio input, let alone discloses the generation of a mapping "between a plurality of audio locations in the audio data and a corresponding plurality of text locations in the textual version of the work" as recited in claim 1. App. Br. 21 ( emphasis omitted). As a matter of claim construction for the claim term "mapping," the Examiner properly looks to Appellants' Specification for context: Appellant[ s '] original specification describes a "mapping" as: "a mapping may be used to identify which displayed text corresponds to an audio recording of a person reading the text as the audio recording is being played and cause the identified text to be highlighted. Thus, while an audio book is being played, a user of an e-book reader may follow along as the e-book reader highlights the corresponding text (Original Specification, pg. 22, para. [0071]) Appellant[s'] original specification also describes a "mapping" as involving: an Audio-to-text correlator that associates the text location with the audio location to create a mapping record in document-to-audio mapping (Original Specification, pg. 17, para. [0056]) Ans. 24 ( emphasis omitted). 
Given Appellants' descriptions in the Specification (id.), we are not persuaded the Examiner's claim interpretation and reading of the claim term "mapping" on the cited features found in Beattie is overly broad, unreasonable, or inconsistent with the Specification, for the reasons explained by the Examiner in the Answer:

Beattie discloses that as the student reads the passage (i.e. provide audio input), the tutor software of the system guides the student through the passage on a sentence-by-sentence basis using sentence-by-sentence tracking that provides a visual indication/highlighting for the entire sentence to be read by the user, where portions of the sentence are visually indicated until the speech recognition obtains an acceptable recognition i.e. audio input from the user, after which the visual indication progresses to a subsequent sentence or clause (Beattie, para. [0046]). Therefore, since Beattie discloses a plurality of text locations within a sentence and visually indicat[es] the words (i.e. text location) when audio corresponding to the words (i.e. multiple audio locations) [is] provided, Beattie discloses "based on the text generated for the portions of the audio data, generating a mapping between a plurality of audio locations in the audio data and a corresponding plurality of text locations in the textual version of the work" consistent with appellant[s'] description of a "mapping."

Also, Mayer discloses providing a correlation of hypertext (i.e. text locations) and phoneme (i.e. audio locations) sequences, measured in, for time seconds or words, where the sequence of hyperlink words/phrases occurring within the duration of a given window are stored in a [d]atabase along with phoneme models of such speech (col. 6, ln 66 - col.
7, ln 15), also corresponding to "based on the text generated for the portions of the audio data, generating a mapping between a plurality of audio locations in the audio data and a corresponding plurality of text locations in the textual version of the work" also consistent with appellant[s'] description of a "mapping."

Ans. 25 (emphasis omitted).

Regarding contested limitation L2 of claim 1, under Rejection A, on this record, and given the Examiner's broad but reasonable claim interpretation of the claim term "mapping," we find a preponderance of the evidence supports the Examiner's underlying factual findings and ultimate legal conclusion of obviousness. Id.

The Combinability of Beattie and Mayer under 35 U.S.C. § 103(a)

Appellants urge the Examiner has improperly combined Beattie with Mayer:

Notwithstanding the Examiner's deficient rejection, combining Mayer's sliding window with the system of Beattie would render the system of Beattie unsatisfactory for its intended purpose. As discussed above, Beattie is intended to perform speech recognition on an audio input from a user, and, based on the speech recognition analysis, verify whether the user's oral reading matches the sentence. Importantly, Mayer's sliding window is designed to include only previously played hyperlink words. As such, to the extent that Mayer's design of the sliding window can be generalized, Mayer at most describes keeping track of a current playback position in the HTML page and constructing a sliding window containing only words occurring prior to the currently tracked position. As a result, incorporating this feature of Mayer into Beattie would render Beattie's system inoperable. In Beattie, the system tracks a position between text that has been read by the user and text that has not been read by the user.
Incorporating Mayer's teaching into Beattie would result in a sliding window containing only words that have been correctly read by the user (i.e., words occurring prior to the currently tracked position in Beattie). Such a sliding window of words would be essentially useless for recognizing subsequent audio inputs, which correspond to text that has not been read by the user.

App. Br. 24 (footnotes and emphasis omitted).

In response, the Examiner further explains the basis for the rejection:

In this case, the Office Action (12/30/15, pg. 16) clearly provides a motivation for combining the references and for utilizing the sliding window from knowledge found in the reference Mayer, and as required in Mayer, the motivation includes providing an easier way to implement speech recognition as a result of the small vocabulary of words that the speech can be recognized/translated into, as provided by Mayer (col. 4, ln 47-57; fig. 7).

Ans. 27 (emphasis omitted). The Examiner finds Appellants are bodily incorporating the teachings of Mayer into Beattie. Id.

We agree with the Examiner that Appellants' argument (App. Br. 24) appears to be premised on a "physical" or "bodily" incorporation of Mayer into Beattie. This is not the standard. See In re Sneed, 710 F.2d 1544, 1550 (Fed. Cir. 1983) ("[I]t is not necessary that the inventions of the references be physically combinable to render obvious the invention under review."); In re Keller, 642 F.2d 413, 425 (CCPA 1981) ("The test for obviousness is not whether the features of a secondary reference may be bodily incorporated into the structure of the primary reference; ... Rather, the test is what the combined teachings of the references would have suggested to those of ordinary skill in the art."). Moreover, "[a] reference must be considered for everything it teaches by way of technology and is not limited to the particular invention it is describing and attempting to protect." EWP Corp. v.
Reliance Universal Inc., 755 F.2d 898, 907 (Fed. Cir. 1985) (emphasis omitted). See also KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398, 419 (2007) ("[N]either the particular motivation nor the avowed purpose of the [Appellants] controls" in an obviousness analysis.).

The Examiner provides an additional detailed explanation on page 28 of the Answer. We agree with and adopt the Examiner's detailed responses (id.), particularly: (1) that Mayer "would be an added benefit to Beattie's speech recognition system by restricting the recognition of the audio spoken by Beattie's user to only a small collection of words present in a highlighted passage of the entire book/work described in Beattie, thereby enabling easier/faster recognition," (2) that Appellants' analysis makes no sense as to why it would be unsatisfactory to combine the teachings of Beattie with Mayer "since both references seek to limit/constrain/restrict the amount of words that recognition utilizes at any given point in time, where using the sliding window would be an added benefit" (emphasis omitted), and (3) that Beattie (Abstract) and Mayer (Abstract) are analogous art within the same field of endeavor. Ans. 28.

On this record, we find the Examiner provided sufficient articulated reasoning with some rational underpinning to establish why an artisan would have been motivated to modify Beattie with the teachings and suggestions of Mayer. Final Act. 5-6. [5] Moreover, Appellants do not provide evidence sufficient to demonstrate that combining the teachings of Beattie and Mayer, as proffered by the Examiner (id.), would have been "uniquely challenging or difficult for one of ordinary skill in the art," Leapfrog Enters., Inc. v.
Fisher-Price, Inc., 485 F.3d 1157, 1162 (Fed. Cir. 2007), nor have Appellants provided any objective evidence of secondary considerations, which our reviewing court guides "operates as a beneficial check on hindsight," Cheese Systems, Inc. v. Tetra Pak Cheese and Powder Systems, Inc., 725 F.3d 1341, 1352 (Fed. Cir. 2013).

[5] The Supreme Court guides: "rejections on obviousness grounds cannot be sustained by mere conclusory statements; instead, there must be some articulated reasoning with some rational underpinning to support the legal conclusion of obviousness." KSR, 550 U.S. at 418 (quoting In re Kahn, 441 F.3d 977, 988 (Fed. Cir. 2006)).

For at least the aforementioned reasons, on this record, we find a preponderance of the evidence supports the Examiner's underlying factual findings and ultimate legal conclusion of obviousness regarding contested limitations L1 and L2 of claim 1 as rejected under Rejection A. Claims 2, 6, 15-17, 21, and 30 (not argued separately) fall with representative claim 1. See supra "Grouping of Claims."

Accordingly, we sustain the Examiner's Rejection A of claims 1, 2, 6, 15-17, 21, and 30. We address separately argued claims 3 and 18, infra.

Dependent Claims 3 and 18 under Rejection A

Dependent claim 3 recites: "The method of Claim 2, wherein generating text for portions of the audio data based, at least in part, on textual context of the work includes generating text based, at least in part, on one or more rules of grammar used in the textual version of the work" (emphasis added).

At the outset, we note that the argued "one or more rules of grammar" limitation that is recited in claim 3 (and in claim 18 by virtue of its dependency upon claim 3), is not recited in claims 2, 6, 15, 16, 17, 21, and 30, which Appellants have improperly grouped together based upon claim 3 being purportedly representative of all claims in this group. See App. Br. 26.
However, claims 2, 6, 15, 16, 17, 21, and 30 do not directly or indirectly depend from claim 3. Therefore, arguments not made are waived for claims 2, 6, 15, 16, 17, 21, and 30, as rejected by the Examiner under Rejection A. See 37 C.F.R. § 41.37(c)(1)(iv). Because Appellants do not separately argue claim 18 (which depends from claim 3), we decide the appeal of Rejection A of dependent claims 3 and 18 on the basis of representative claim 3. See 37 C.F.R. § 41.37(c)(1)(iv).

Regarding dependent claim 3, Appellants focus on the recited "one or more rules of grammar":

[D]ependent claim 3 recites, inter alia, "wherein generating text for portions of the audio data based, at least in part, on textual context of the work includes generating text based, at least in part, on one or more rules of grammar used in the textual version of the work." The Examiner relied on Beattie at paragraphs [0043] and [0045] and FIG. 3 for disclosing claim 3. However, Beattie's disclosure does not support the Examiner's argument. In fact, these cited portions of Beattie do not even mention "rules of grammar".

App. Br. 26 (footnote and emphasis omitted).

We note there is no requirement in an obviousness analysis for the prior art to "contain a description of the subject matter of the appealed claim in ipsissimis verbis." In re May, 574 F.2d 1082, 1090 (CCPA 1978). Moreover, "the question under 35 USC 103 is not merely what the references expressly teach but what they would have suggested to one of ordinary skill in the art at the time the invention was made." Merck & Co. v. Biocraft Labs., Inc., 874 F.2d 804, 807 (Fed. Cir. 1989) (emphasis added) (quoting In re Lamberti, 545 F.2d 747, 750 (CCPA 1976)); see also MPEP § 2123. Additionally, the skilled artisan is "[a] person of ordinary creativity, not an automaton." KSR, 550 U.S. at 421.
"Every patent application and reference relies to some extent upon knowledge of persons skilled in the art to complement that [ which is] disclosed .... " In re Bode, 550 F .2d 656, 660 (CCPA 1977) (quoting In re Wiggins, 488 F.2d 538,543 (CCPA 1973)). Those persons "must be presumed to know something" about the art "apart from what the references disclose." In re Jacoby, 309 F.2d 513, 516 (CCPA 1962). As a matter of claim construction, we tum to the Specification for context: Different works may have significantly different textual contexts. For example, the grammar used in a classic English novel may be very different that the grammar of modem poetry. Thus, while a certain word order may follow the rules of one grammar, that same word order may violate the rules of another grammar. Similarly, the grammar used in both a classic English novel and modem poetry may differ from the grammar ( or lack thereof) employed in a text message sent from one teenager to another. Spec. ,r 38. In response, the Examiner further explains the basis for the rejection: As provided in the Office Action rejection of Claim 3, correctly pronounced English story text including English words are based on rules of English grammar ( Office Action, 12/30/15, pg. 16-17). Appellant[s'] arguments do not mention/include any statements as to why a speech recognition system that recognizes English text from received English audio fail to utilize English rules of grammar. In fact, [ A Jppellant' s arguments fail to mention/refer to the word "English" other than the careful reproduction of the text of the original rejection that includes the phrase "English language". 17 Appeal2017-007625 Application 13/267,738 Nevertheless, Beattie discloses a speech recognition engine used to generate speech recognition result (i.e. text) for the audio received by utilizing acoustic models, language models and a pronunciation dictionary (para. [0043]) that are based on the English language (para. 
[0044]-[0045]; para. [0092]). Beattie further discloses that the text in the pronunciation dictionary comes directly from the story texts/passages of the English story texts (fig. 3; para. [0045]). Beattie, for claim 2 from which claim 3 depends, also discloses the use of the language model based on where the user is reading in the text and the use of the context of the user's reading (Beattie, para. [0066] and rejection of claim 2). Therefore Beattie discloses performing speech recognition to generate recognition results/text for portions of the received user speech/audio data that is based on textual context of the story book/work, where the generating includes generating text based on the English language story text (fig. 3). The English language story text as provided in figure 3 show an arrangement of words and phrases to create well-formed sentences in English language (i.e. one or more rules of grammar), and hence, the claim limitation.

Furthermore, Beattie discloses analyzing received audio to determine if such audio correspond to text within a syntactic boundary of the text/textual version (para. [0091]) i.e. a context syntactic analysis. Since determination of syntactic boundaries involve a syntactic/syntax (universally defined as "rules of grammar") analysis of the English text, and "syntax" is further defined as the arrangement of words and phrases to create well-formed sentences in a language (story text in fig. 3 of Beattie clearly shows arrangement of words and phrases to create well-formed sentences in English language), the examiner maintains that Beattie discloses "wherein generating text for portions of the audio data based, at least in part, on textual context of the work includes generating text based, at least in part, on one or more rules of grammar used in the textual version of the work".

Ans. 30-31 (emphasis omitted).
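The idea underlying the Examiner's reading, that a language model built from the work's own text biases recognition toward that work's word order, can be sketched roughly as follows. This is a purely illustrative sketch and not part of the record or of Beattie's disclosure; the bigram model is a crude stand-in for "rules of grammar used in the textual version," and all names are hypothetical.

```python
# Illustrative sketch (not part of the record): a language model built
# from the story text itself prefers candidate words that follow the
# work's own word order, a crude proxy for its "rules of grammar."
from collections import defaultdict

def build_bigrams(text_words):
    """Count adjacent word pairs as they occur in the textual version."""
    counts = defaultdict(int)
    for a, b in zip(text_words, text_words[1:]):
        counts[(a, b)] += 1
    return counts

def pick_candidate(prev_word, candidates, bigrams):
    """Choose the candidate the work's own word order favors."""
    return max(candidates, key=lambda w: bigrams[(prev_word, w)])

story = "the cat sat on the mat and the cat slept".split()
model = build_bigrams(story)
# Between acoustically confusable candidates after "cat", the story's
# own text favors "sat" over "sap".
print(pick_candidate("cat", ["sap", "sat"], model))
```

A production recognizer would combine such a text-derived language model with acoustic scores and a pronunciation dictionary, as the cited paragraphs of Beattie describe at a high level.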
Although Appellants contend in the Reply Brief (15-17) that the Examiner has failed to set forth a prima facie case, we disagree. Based upon our review of the record, we find the Examiner has met the burden of establishing the notice requirement of 35 U.S.C. § 132(a) for Rejection A of dependent claim 3.6 On this record, we find a preponderance of the evidence supports the Examiner's underlying factual findings and ultimate legal conclusion of obviousness regarding the contested "one or more rules of grammar" limitation recited in dependent claim 3, as rejected under Rejection A. Claim 18 (not argued separately) falls with representative claim 3, from which it depends. Accordingly, we sustain the Examiner's Rejection A of dependent claims 3 and 18. Contested Limitations L1 and L2 of Independent Claim 14 under Rejection B Appellants contest the Examiner's findings regarding the following limitations (L1 and L2), as recited in representative independent claim 14: based on [L1] (1) a comparison of the first audio data with the second audio data and [L2] (2) the first mapping record, generating a second mapping record correlating a second plurality of audio locations in the second audio data 6 The Federal Circuit guides "the prima facie case is merely a procedural device that enables an appropriate shift of the burden of production." Hyatt v. Dudas, 492 F.3d 1365, 1369 (Fed. Cir. 2007). This burden is met by "adequately explain[ing] the shortcomings it perceives so that the applicant is properly notified and able to respond." Id. at 1370. It is only "when a rejection is so uninformative that it prevents the applicant from recognizing and seeking to counter the grounds for rejection" that the prima facie burden has not been met and the rejection violates the minimal requirements of 35 U.S.C. § 132(a). Chester v. Miller, 906 F.2d 1574, 1578 (Fed. Cir. 1990).
and the plurality of text locations in the textual version of the work[.] Claim 14 (emphasis added); see App. Br. 29. Appellants contend: Adams and Heckerman, alone or in combination, do not disclose or suggest "based on (1) a comparison of the first audio data with the second audio data and (2) the first mapping record, generating a second mapping record correlating a second plurality of audio locations in the second audio data and the plurality of text locations in the textual version of the work" as recited in claim 14. App. Br. 29-30 (boldface omitted). Appellants note the Examiner relies on "Adams, at column 3, line 57 to column 5, line 4; column 7, lines 9-19; and column 7, lines 20-44, to disclose the above-recited features of claim 14." App. Br. 30. Appellants explain that, in a given lesson, Adams's "computer and the user read different text segments, and the user's reading and the computer's reading are not compared with each other. Such a comparison would be counterintuitive at best as it could not be used to verify the user's reading." App. Br. 37 (emphasis omitted).7 We disagree. Regarding the contested first and second mapping records recited in independent claim 14, we begin our analysis by turning to the Specification for context: According to one approach, a mapping (whether created manually or automatically) is used to identify the locations within an audio version of a digital work (e.g., an audio book) 7 Appellants also note: "Heckerman does not cure the deficiencies of Adams. The Examiner does not rely on Heckerman for disclosing 'a comparison of the first audio data with the second audio data' as recited in claim 14." Id. that correspond to locations within a textual version of the digital work (e.g., an e-book). For example, a mapping may be used to identify a location within an e-book based on a "bookmark" established in an audio book.
As another example, a mapping may be used to identify which displayed text corresponds to an audio recording of a person reading the text as the audio recording is being played and cause the identified text to be highlighted. Spec. ¶ 71 (emphasis added). The operation of Adams is explained in pertinent part at column 7, lines 20-41: The positional pacer 17, with input from the local text position management databases 22, may be implemented to continuously prompt the student along the text, identifying (e.g., "follow the bouncing ball") the word that is being read by the computer instructor or is to be pronounced by the student. In addition, the executive program may highlight or color differentiate the text to be read by the computer and that to be read by the student. At each juncture of the lesson at which the student is to utter one or more text segments, the user's input is provided to the speech recognition interface 2 as described above. Correct input may be passively acknowledged, simply by the progress of the lesson to the next text segment, or actively acknowledged by audio output generated via access to the feedback message database 21. Rapid progress, as sensed by the positional pacer, will cause the positional pacer to increase the tempo of the lesson. If the user correctly pronounces the "current" (e.g., highlighted) word before the pacer reaches the boundary between that word and the next word (the right boundary for standard left-to-right English reading), the pacer will accelerate slightly. Similarly, if the pacer reaches the boundary before the user pronounces the word, the pacer will slow its overall rate slightly. Adams col. 7, ll. 20-41 (emphasis added).
In the Answer, the Examiner provides a detailed explanation in support of the rejection: Adams is directed to simultaneously providing displayed text and audio corresponding to the displayed text while receiving a second audio from a student user, wherein an executive program is used in generating synthetic audio (i.e. a text-to-speech analysis) corresponding to displayed text (col. 7, ln 9-19), where the generated synthetic and simultaneously provided audio corresponds to appellant[s'] "a first audio". Adams further discloses the system prompting a student user to utter one or more of the displayed text words, and determining if the student user's presented speech/audio (i.e. a second audio) is a correct pronunciation of the words displayed, and providing an indication if the words were correctly pronounced (col. 7, ln 20-44), corresponding to "a comparison of the first audio data with the second audio data". Therefore Adams discloses providing "a first audio" and "a second audio", and since Adams already discloses the "first mapping" by way of the synchronized audio and text locations as provided above (which appellant does not dispute), and also discloses (col. 7, ln 20-44) determining whether a correct pronunciation is received by the student user, where correct pronunciation/audio garners an indication (i.e. color highlighting or non-highlighting of the text and the positional pacer moving to the next text location), Adams discloses "a second mapping" correlating locations of displayed text with provided audio and dependent on the already synchronized. [(blank line inserted).]
Since Adams discloses the color highlighting or non-highlighting of displayed text and the positional pacer moving to the next text locations upon the user providing the "correct" pronunciation/audio for the previous text locations, Adams discloses linking/correlating portions/locations of the student provided audio with the locations of the displayed text via the color highlighting or non-highlighting, and as a result, discloses "based on (1) a comparison of the first audio data with the second audio data and (2) the first mapping record generating a second mapping record correlating a second plurality of audio locations in the second audio data and the plurality of text locations in the textual version of the work." Ans. 34-35 (emphasis omitted). In the Reply Brief, Appellants contend: "Adams describes that the user's reading is checked via speech recognition, not a comparison with the reading by the computer instructor. In other words, Adams' ability to check whether the user's pronunciations are correct does not disclose or suggest 'a comparison of the first audio data with the second audio data.'" Reply Br. 21. However, the Examiner explains an additional way that Adams teaches or suggests the contested L1 and L2 limitations of claim 14: Adams discloses the use of a session database that provides a replay of the joint reading of the text by the companion (i.e. the executive program) and the student that includes the portions read correctly by the student, and the portions read by the computer instructor/companion while allowing the student to resume/replay the session at a later time, and enabling storage of the session (col. 4, ln 17-28; col. 9, ln 51-65). Ans. 35 (see continued detailed explanation on pages 36 and 37).
As a matter of claim interpretation, the Examiner notes that the Specification describes a "mapping record" as follows: "Therefore, each mapping record in the mapping may include 'highlighting data' that indicates how the text identified by the corresponding text location is to be highlighted. Thus, for each mapping record in the mapping that the media player identifies and that includes highlighting data, the media player uses the highlighting data to determine how to highlight the text" (pg. 35, para. [0124]). Ans. 39 (emphasis omitted). For example, the Examiner further finds: Adams discloses the system determining if a user's presented audio (i.e. a second audio) is a correct pronunciation of the words displayed, and providing an indication/highlighting if the words were correctly displayed (col. 7, ln 20-44) where the indication/highlighting corresponds to the "second mapping record" consistent with appellant[s'] reproduced disclosure provided above, and also consistent with the claim language since the first mapping/synchronization links visual text segments with audio text segments in the audio clip database, and the highlighting/non-highlighting of text (i.e. second mapping record) based on the subsequently provided student user's audio would be based on the already synchronized visual text segments with audio text segments aka the first mapping record. Ans. 39 (emphasis omitted). In reviewing the record, we agree that Adams is directed to a different purpose than Appellants' invention. However, as previously noted, the Supreme Court guides that "neither the particular motivation nor the avowed purpose of the [Appellants] controls" in an obviousness analysis. KSR, 550 U.S. at 419.
Even though Adams is directed to a different purpose than Appellants' invention, the Supreme Court provides further guidance that we find is dispositive here: "[A] combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results." Id. at 416. Therefore, on this record, we find a preponderance of the evidence supports the Examiner's detailed explanations, underlying factual findings, and ultimate legal conclusion of obviousness for independent claim 14. Ans. 31-41. Accordingly, we sustain the Examiner's Rejection B of independent claim 14, and dependent claim 29 (not argued separately) that falls therewith. See 37 C.F.R. § 41.37(c)(1)(iv). Rejection C under § 103(a) of Claims 7-13 and 22-28 Appellants have not substantively and separately contested the remaining claims rejected under Rejection C. Arguments not made are waived. See 37 C.F.R. § 41.37(c)(1)(iv). Accordingly, we sustain the Examiner's Rejection C under 35 U.S.C. § 103(a) of dependent claims 7-13 and 22-28. CONCLUSION The Examiner did not err in rejecting claims 1-3, 6-18, and 21-30 under pre-AIA 35 U.S.C. § 103(a). DECISION We affirm the Examiner's decision rejecting claims 1-3, 6-18, and 21-30 under pre-AIA 35 U.S.C. § 103(a). No time period for taking any subsequent action in connection with this appeal may be extended under 37 C.F.R. § 1.136(a)(1)(iv). See 37 C.F.R. § 41.50(f). AFFIRMED